ERWIN KREYSZIG
ADVANCED
ENGINEERING
••••
MATHEMATICS
••••
n
.,:
I
.
' . ,
~.;...
<,.
.••
..
'.
•
..
,~.
.'~~~
.,.L......·
-_.,
. ::: .'
L •
."
~.::::----....
• ,
""
....
"
'"
:'
'..
"
.. -
I'
1,1
11
't.
"'11 ~ r
.
i
:,t::a'. ..
'..
:,., ,-" ""',
~;.
;' ,-:~.;;. ~.~: ,.". "':.'. '-,".~..
;
,.....
....
f
'~":~"_
;11,",;
.
• - _..
."
"'-;:-"'"
'_:
.
.
,;~"
",. . .
',-'
,
:":~:~,:..
... .
,,
,
.
-...
~~
4'
~:-~"'" . ~,'~..
fif I ••
;;~:, '. " __~~ ,INTERNATIONAL
~
'; l\. .
.. ...
• '';
'I It.
. la.· '''.:'"
'-....- ...
......
"-.:lJ."
. . '~ \; i, ."
•
"
"..-:. ;'1
... l- ' \
.~.
9TH EDITION
•
o
•• e.
•
~;.
..
,....
I
••
..
RESTRICTED!
Not for Sale in
.~ the United States
.. "'"
....\t'~ ,.~.
. ~
'.. :. \ .
Advanced
Engineering
Mathematics
EDITION
Advanced
Engineering
Mathematics
ERWIN KREYSZIG
Professor of Mathematics
Ohio State University
Columbus, Ohio
@
WILEY
JOHN WILEY & SONS, INC.
Vice President and Publisher: Laurie Rosatone
Editorial Assistant: Daniel Grace
Associate Production Director: Lucille Buonocore
Senior Production Editor: Ken Santor
Media Editor: Stefanie Liebman
Cover Designer: Madelyn Lesure
Cover Photo: © John Sohm/ChromosohmJPhoto Researchers
This book was set in Times Roman by GGS Information Services
Copyright © 2006 John Wiley & Sons. Inc. All rights reserved.
No part of this publication may be reproduced, stored in a retrieval system or transmitted in any
form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise,
except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act. without
either the prior written permission of the Publisher, or authorization through payment of the
appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA
01923, (508) 750-8400, fax (508) 750-4470. Requests to the Publisher for permission should be
addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ
07030, (201) 748-6011, fax (201) 748-6008, E-Mail: PERMREQ@WILEY.COM.
Kreyszig, Erwin.
Advanced engineering mathematics I Erwin Kreyszig.~9th ed.
p.
cm.
Accompanied by instructor's manual.
Includes bibliographical references and index.
1. Mathematical physics. 2. Engineering mathematics. 1. Title.
ISBN-13: 978-0-471-72897-9
ISBN-10: 0-471-72897-7
Printed in Singapore
10
9
8
7
6
5
4
3
2
PREFACE
See also http://www.wiley.com/coUege/kreyszig/
Goal of the Book. Arrangement of Material
This new edition continues the tradition of providing instmctors and students with a
comprehensive and up-to-date resource for teaching and learning engineering
mathematics, that is, applied mathematics for engineers and physicists, mathematicians
and computer scientists, as well as members of other disciplines. A course in elementary
calculus is the sole prerequisite.
The subject matter is arranged into seven parts A-G:
A Ordinary Differential Equations (ODEs) (Chaps. 1-6)
B Linear Algebra. Vector Calculus (Chaps. 7-9)
C Fourier Analysis. Partial Differential Equations (PDEs) (Chaps. 11-12)
D Complex Analysis (Chaps. 13-18)
E Numeric Analysis (Chaps. 19-21)
F Optimization, Graphs (Chaps. 22-23)
G Probability, Statistics (Chaps. 24-25).
This is followed by five appendices:
App.
App.
App.
App.
App.
I References (ordered by parts)
2
3
4
5
Answers to Odd-Numbered Problems
Auxiliary Material (see also inside covers)
Additional Proofs
Tables of Functions.
This book has helped to pave the way for the present development of engineering
mathematics. By a modern approach to those areas A-G, this new edition will prepare
the student for the tasks of the present and of the future. The latter can be predicted to
some extent by a judicious look at the present trend. Among other features, this trend
shows the appearance of more complex production processes, more extreme physical
conditions (in space travel, high-speed communication, etc.), and new tasks in robotics
and communication systems (e.g., fiber optics and scan statistics on random graphs) and
elsewhere. This requires the refinement of existing methods and the creation of new ones.
It follows that students need solid knowledge of basic principles. methods, and results,
and a clear view of what engineering mathematics is all about, and that it requires
proficiency in all three phases of problem solving:
• Modeling, that is, translating a physical or other problem into a mathematical form,
into a mathematical model; this can be an algebraic equation, a differential equation,
a graph, or some other mathemalical expression.
• Solving the model by selecting and applying a suitable mathematical method, often
requiring numeric work on a computer.
• Interpreting the mathematical result in physical or other terms to see what it
practically means and implies.
It would make no sense to overload students with all kinds of little things that might be of
occasional use. Instead they should recognize that mathematics rests on relatively few basic
concepts and involves powerful unifying principles. This should give them a firm grasp on
the illterrelations amollg theory, computing, and (physical or other) experimentation.
v
Preface
vi
PART A
PART B
Chaps. 1-6
Ordinary Differential Equations (ODEs)
Chaps. 7-10
Linear Algebra. Vector Calculus
Chaps. 1-4
Basic Material
I
I t
l
Chap. 5
Series Solutions
Chap. 6
t
Laplace Transforms
I
t
Chap. 7
Matrices,
Linear Systems
Chap. 9
Vector Differential
Calculus
t Chap. 8
Eigenvalue Problems
Chap. 10
Vector Integral Calculus
t
PARTe
PARTD
Chaps. 11-12
Fourier Analysis. Partial Differential
Equations (PDEs)
Chaps. 13-18
Complex Analysis,
Potential Theory
Chap. 11
Fourier Analysis
Chaps. 13-17
Basic Material
Chap. 12
Partial Differential Equations
Chap. 19
Numerics in
General
,
Chap. 18
Potential Theory
PART E
PART F
Chaps. 19-21
Numeric Analysis
Chaps. 22-23
Optimization, Graphs
Chap. 20
Numeric
Linear Algebra
Chap. 21
Numerics for
ODEs and PDEs
Chap. 22
Linear Programming
Chap. 23
Graphs, Optimization
PARTG
GUIDES AND MANUALS
Chaps. 24-25
Probability, Statistics
Maple Computer Guide
Mathematica Computer Guide
Chap. 24
Data Analysis. Probability Theory
Student Solutions Manual
1- - - 1 - - - - - - - - - - - - - - - - - /
Chap. 25
Mathematical Statistics
Instructor's Manual
Preface
vii
General Features of the Book Include:
• Simplicity of examples, to make the book teachable-why choose complicated
examples when simple ones are as instructive or even better?
• Independence of chapters, to provide flexibility in tailoring courses to special needs.
• Self-contained presentation, except for a few clearly marked places where a proof
would exceed the level of the book and a reference is given instead.
• Modern standard notation, to help students with other courses, modern books, and
mathematical and engineering journals.
Many sections were rewritten in a more detailed fashion. to make it a simpler book. This
also resulted in a better balance between theory and applicatio1ls.
Use of Computers
The presentation is adaptable to various levels of technology and use of a computer or
graphing calculator: very little or no use, medium u~e, or intensive use of a graphing
calculator or of an unspecified CAS (Computer Algebra System, Maple, Mathematica,
or Matlab being popular examples). In either case texts and problem sets form an entity
without gaps or jumps. And many problems can be solved by hand or with a computer
or both ways. (For software, see the beginnings of Part E on Numeric Analysis and Part G
on Probability and Statistics.)
More specifically, this new edition on the one hand gives more prominence to tasks
the computer cannot do, notably, modeling and interpreting results. On the other hand, it
includes CAS projects. CAS problems. and CAS experimellts, which do require a
computer and show its power in solving problems that are difficult or impossible to access
otherwise. Here our goal is the combination of intelligent computer use with high-quality
mathematics. This has resulted in a change from a formula-centered teaching and learning
of engineering mathematics to a more quantitative, project-oriented, and visual approach.
CAS experiments also exhibit the computer as an instrument for observations and
experimentations that may become the beginnings of new research, for "proving" or
disproving conjectures, or for formalizing empirical relationships that are often quite useful
to the engineer as working guidelines. These changes will also help the student in
discovering the experimental aspect of modern applied mathematics.
Some routille and drill work is retained as a necessity for keeping firm contact with
the subject matter. In some of it the computer can (but must not) give the student a hand,
but there are plenty of problems that are more suitable for pencil-and-paper work.
Major Changes
1. New Problem Sets. Modem engineering mathematics is mostly teamwork. [t usually
combines analytic work in the process of modeling and the use of computer algebra and
numeric~ in the process of solution, followed by critical evaluation of results. Our
problems-some straightforward. some more challenging, some "thinking problems" not
acce~sible by a CAS, some open-ended-reflect this modem situation with its increased
emphasis on qualitative methods and applications, and the problem sets take care of this
novel situation by including team projects, CAS projects, and writing projects. The latter
will also help the student in writing general reports, as they are required in engineering
work quite frequently.
2. Computer Experiments. using the computer as an instrument of "experimental
mathematics" for exploration and research (see also above). These are mostly open-ended
viii
Preface
experiments, demonstrating the use of computers in experimentally finding results. which
may be provable afterward or may be valuable heuristic qualitative guidelines to the
engineer, in particular in complicated problems.
3. More on modeling and selecting methods, tasks that usually cannot be automated.
4. Student Solutions Manual and Study Guide enlarged, upon explicit requests
of the users. This Manual contains worked-out solutions to carefully selected odd-numbered
problems (to which App. I gives only the final answers) as well as general comments
and hints on studying the text and working further problems, including explanations on
the significance and character of concepts and methods in the various sections of the
book.
Further Changes, New Features
• Electric circuits moved entirely to Chap. 2, to avoid duplication and repetition
• Second-order ODEs and Higher Order ODEs placed into two separate chapters
(2 and 3)
• In Chap. 2, applications presented before variation of parameters
• Series solutions somewhat shortened, without changing the order of sections
• Material on Laplace transforms brought into a better logical order: partial fractions
used earlier in a more practical approach, unit step and Dirac's delta put into separate
subsequent sections, differentiation and integration of transforms (not of functions!)
moved to a later section in favor of practically more important topics
• Second- and third-order determinants made into a separate section for reference
throughout the book
• Complex matrices made optional
• Three sections on curves and their application in mechanics combined in a single section
• First two sections on Fourier series combined to provide a better, more direct start
• Discrete and Fast Fourier Transforms included
• Conformal mapping presented in a separate chapter and enlarged
• Numeric analysis updated
• Backward Euler method included
• Stiffness of ODEs and systems discussed
• List of software (in Part E) updated; another list for statistics software added (in Part G)
• References updated, now including about 75 books published or reprinted after 1990
Suggestions for Courses: A Four-Semester Sequence
The material, when taken in sequence, is suitable for four consecutive semester courses,
meeting 3-4 hours a week:
1st Semester.
2nd Semester.
3rd Semester.
4th Semester.
ODEs (Chaps. 1-5 or 6)
Linear Algebra. Vector Analysis (Chaps. 7-10)
Complex Analysis (Chaps. 13-18)
Numeric Methods (Chaps. 19-21)
ix
Preface
Suggestions for Independent One-Semester Courses
The book is also suitable for various independent one-semester courses meeting 3 hours
a week. For instance:
Introduction to ODEs (Chaps. 1-2, Sec. 21.1)
Laplace Transforms (Chap. 6)
Matrices and Linear Systems (Chaps. 7-8)
Vector Algebra and Calculus (Chaps. 9-10)
Fourier Series and PDEs (Chaps. 11-12, Secs. 21.4-21.7)
Introduction to Complex Analysis (Chaps. 13-17)
Numeric Analysis (Chaps. 19, 21)
Numeric Linear Algebra (Chap. 20)
Optimization (Chaps. 22-23)
Graphs and Combinatorial Optimization (Chap. 23)
Probability and Statistics (Chaps. 24-25)
Acknowledgments
I am indebted to many of my former teachers, colleagues, and students who helped me
directly or indirectly in preparing this book, in particular, the present edition. I profited
greatly from discussions with engineers, physicists, mathematicians, and computer
scientists, and from their written comments. I want to mention particularly Y. Antipov,
D. N. Buechler, S. L. Campbell, R. CalT, P. L. Chambre, V. F. Connolly, Z. Davis, J. Delany,
J. W. Dettman, D. Dicker, L. D. Drager, D. Ellis, W. Fox, A. Goriely, R. B. Guenther,
J. B. Handley, N. Harbertson, A. Hassen, V. W. Howe, H. Kuhn, G. Lamb, M. T. Lusk,
H. B. Mann, I. Marx, K. Millet, J. D. Moore, W. D. Munroe, A. Nadim, B. S. Ng, J. N.
Ong. Jr.. D. Panagiotis, A. Plotkin. P. J. Pritchard, W. O. Ray, J. T. Scheick. L. F.
Shampine. H. A. Smith, J. Todd, H. Unz, A. L. Villone, H. J. Weiss, A. Wilansky, C. H.
Wilcox, H. Ya Fan, and A. D. Ziebur, all from the United States, Professors E. J.
Norminton and R. Vaillancourt from Canada, and Professors H. Florian and H. Unger
from Europe. I can offer here only an inadequate acknowledgment of my gratitude and
appreciation.
Special cordial thanks go to Privatdozent Dr. M. Kracht and to Mr. Herbert Kreyszig,
MBA, the coauthor of the Student Solutions Manual, who both checked the manuscript
in all details and made numerous suggestions for improvements and helped me proofread
the galley and page proofs.
Furthermore. I wish to thank John Wiley and Sons (see the list on p. iv) as well as GGS
InfolTllation Services, in particular Mr. K. Bradley and Mr. J. Nystrom. for their effective
cooperation and great care in preparing this new edition.
Suggestions of many readers worldwide were evaluated in preparing this edition.
Further comments and suggestions for improving the book will be gratefully received.
ERWIN KREYSZIG
CONTENTS
PART A
Ordinary Differential Equations (ODEs)
1
CHAPTER 1 First-Order ODEs
2
1.1
Basic Concepts. Modeling 2
1.2 Geometric Meaning of y' = f(x, y). Direction Fields 9
1.3
Separable ODEs. Modeling 12
1.4 Exact ODEs. Integrating Factors 19
1.5 Linear ODEs. Bernoulli Equation. Population Dynamics 26
1.6 Orthogonal Trajectories. Optional 35
1.7 Existence and Uniqueness of Solutions 37
Chapter 1 Review Questions and Problems 42
Summary of Chapter 1 43
CHAPTER 2 Second-Order Linear ODEs
45
2.1
Homogeneous Linear ODEs of Second Order 45
2.2 Homogeneous Linear ODEs with Constant Coefficients 53
2.3 Differential Operators. Optional 59
2.4 Modeling: Free Oscillations. (Mass-Spring System) 61
2.5 Euler-Cauchy Equations 69
2.6 Existence and Uniqueness of Solutions. Wronskian 73
2.7 Nonhomogeneous ODEs 78
2.8 Modeling: Forced Oscillations. Resonance 84
2.9 Modeling: Electric Circuits 91
2.10 Solution by Variation of Parameters 98
Chapter 2 Review Questions and Problems 102
Summary of Chapter 2 103
CHAPTER 3 Higher Order Linear ODEs
105
3.1
Homogeneous Linear ODEs 105
3.2 Homogeneous Linear ODEs with Constant Coefficients 111
3.3 Nonhomogeneous Linear ODEs 116
Chapter 3 Review Questions and Problems 122
Summary of Chapter 3 123
CHAPTER 4 5v stems of ODEs. Phase Plane. Qualitative Methods
4.0 Basics of Matrices and Vectors 124
4.1
Systems of ODEs as Models 130
4.2 Basic Theory of Systems of ODEs 136
4.3 Constant-Coefficient Systems. Phase Plane Method 139
4.4 Criteria for Critical Points. Stability 147
4.5 Qualitative Methods for Nonlinear Systems 151
4.6 Nonhomogeneous Linear Systems of ODEs 159
Chapter 4 Review Questions and Problems 163
Summary of Chapter 4 164
CHAPTER 5 Series Solutions of ODEs. Special Functions
5.1
Power Series Method 167
5.2 Theory of the Power Series Method 170
124
166
xi
xii
Contents
5.3 Legendre's Equation. Legendre Polynomials Pnex) 177
5.4 Frobenius Method 182
5.5 Bessel's Equation. Bessel Functions lvCx) 189
5.6 Bessel Functions of the Second Kind YvCx) 198
5.7 Sturm-Liouville Problems. Orthogonal Functions 203
5.8 Orthogonal Eigenfunction Expansions 210
Chapter 5 Review Questions and Problems 217
Summary of Chapter 5 218
CHAPTER 6 Laplace Transforms
220
6.1
Laplace Transform. Inverse Transform. Linearity. s-Shifting 221
6.2 Transforms of Derivatives and Integrals. ODEs 227
6.3 Unit Step Function. t-Shifting 233
6.4 Short Impulses. Dirac's Delta Function. Pm1ial Fractions 241
6.5 Convolution. Integral Equations 248
6.6 Differentiation and Integration of Transforms. 254
6.7 Systems of ODEs 258
6.8 Laplace Transform: General Formulas 264
6.9 Table of Laplace Transforms 265
Chapter 6 Review Questions and Problems 267
Summary of Chapter 6 269
PART B
Linear Algebra. Vector Calculus
CHAPTER 7
271
Linear Algebra: Matrices, Vectors, Determinants.
Linear Systems
272
7.1
Matrices, Vectors: Addition and Scalar Multiplication 272
7.2 Matrix Multiplication 278
7.3 Linear Systems of Equations. Gauss Elimination 287
7.4 Linear Independence. Rank of a Matrix. Vector Space 296
7.5
Solutions of Linear Systems: Existence, Uniqueness 302
7.6 For Reference: Second- and Third-Order Determinants 306
7.7 Determinants. Cramer's Rule 308
7.8 Inverse of a Matrix. Gauss-Jordan Elimination 315
7.9 Vector Spaces, Inner Product Spaces. Linear Transformations. Optional 323
Chapter 7 Review Questions and Problems 330
Summary of Chapter 7 331
CHAPTER 8 Linear Algebra: Matrix Eigenvalue Problems
8.1
Eigenvalues, Eigenvectors 334
8.2 Some Applications of Eigenvalue Problems 340
8.3 Symmetric, Skew-Symmetric, and Orthogonal Matrices 345
8.4 Eigenbases. Diagonalization. Quadratic Forms 349
8.5 Complex Matrices and Forms. Optional 356
Chapter 8 Review Questions and Problems 362
Summary of Chapter 8 363
333
Contents
xiii
CHAPTER 9 Vector Differential Calculus.
9.1
Vectors in 2-Space and 3-Space 364
9.2 Inner Product (Dot Product) 371
Grad, Div, Curl
364
9.3
Vector Product (Cross Product) 377
Vector and Scalar Functions and Fields. Derivatives 384
Curves. Arc Length. Curvature. Torsion 389
Calculus Review: Functions of Several Variables. Optional 400
Gradient of a Scalar Field. Directional Derivative 403
Divergence of a Vector Field 410
Curl of a Vector Field 414
Chapter 9 Review Questions and Problems 416
Summary of Chapter 9 417
9.4
9.5
9.6
9.7
9.8
9.9
CHAPTER 10 Vettor Integral Calculus. Integral Theorems
10.1 Line Integrals 420
10.2 Path Independence of Line Integrals 426
10.3 Calculus Review: Double Integrals. Optional 433
10.4 Green's Theorem in the Plane 439
10.5 Surfaces for Surface Integrals 445
10.6 Surface Integrals 449
10.7 Triple Integrals. Divergence Theorem of Gauss 458
10.8 Further Applications of the Divergence Theorem 463
10.9 Stokes's Theorem 468
Chapter 10 Review Questions and Problems 473
420
Summary of Chapter 10 474
PART C
Fourier Analysis. Partial Differential Equations (PDEs)
CHAPTER 11 Fourier Series, Integrals, and Transforms
11.1 Fourier Series 478
11.2 Functions of Any Period p = 2L 487
11.3 Even and Odd Functions. Half-Range Expansions 490
11.4 Complex Fourier Series. Optiollal 496
11.5 Forced Oscillations 499
11.6 Approximation by Trigonometric Polynomials 502
11.7 Fourier Integral 506
11.8 Fourier Cosine and Sine Transforms 513
11.9 Fourier Transform. Discrete and Fast Fourier Transforms
11.10 Tables of Transforms 529
Chapter 11 Review Questions and Problems 532
Summary of Chapter II 533
478
518
CHAPTER 12 Partial Differential Equations (PDEs)
535
12.1 Basic Concepts 535
12.2 Modeling: Vibrating String, Wave Equation 538
12.3 Solution by Separating Vatiables. Use of Fourier Series 540
12.4 D' Alembert's Solution of the Wave Equation. Characteristics 548
12.5 Heat Equation: Solution by Fourier Series 552
477
xiv
Contents
Heat Equation: Solution by Fourier Integrals and Transforms 562
Modeling: Membrane, Two-Dimensional Wave Equation 569
Rectangular Membrane. Double Fourier Series 571
Laplacian in Polar Coordinates. Circular Membrane. Fourier-Bessel Series 579
Laplace's Equation in Cylindrical and Spherical Coordinates. Potential 587
Solution of PDEs by Laplace Transforms 594
Chapter 12 Review Questions and Problems 597
Summary of Chapter 12 598
12.6
12.7
12.8
12.9
12.10
12.11
PART D
Complex Analysis 601
CHAPTER 13 Complex Numbers and Functions
602
13.1 Complex Numbers. Complex Plane 602
13.2 Polar Form of Complex Numbers. Powers and Roots 607
13.3 Derivative. Analytic Function 612
13.4 Cauchy-Riemann Equations. Laplace's Equation 618
13.5 Exponential Function 623
13.6 Trigonometric and Hyperbolic Functions 626
13.7 Logarithm. General Power 630
Chapter 13 Review Questions and Problems 634
Summary of Chapter 13 635
CHAPTER 14 Complex Integration
637
14.1 Line Integral in the Complex Plane 637
14.2 Cauchy's Integral Theorem 646
14.3 Cauchy's Integral Formula 654
14.4 Derivatives of Analytic Functions 658
Chapter 14 Review Questions and Problems 662
Summary of Chapter 14 663
CHAPTER 15 Power Series, Taylor Series
15.1 Sequences, Series, Convergence Tests 664
15.2 Power Series 673
15.3 Functions Given by Power Series 678
15.4 Taylor and Maclaurin Series 683
15.5 Uniform Convergence. Optional 691
Chapter 15 Review Questions and Problems 698
Summary of Chapter 15 699
664
CHAPTER 16 Laurent Series. Residue Integration
16.1 Laurent Series 701
16.2 Singularities and Zeros. Infinity 707
16.3 Residue Integration Method 712
16.4 Residue Integration of Real Integrals 718
Chapter 16 Review Questions and Problems 726
Summary of Chapter 16 727
CHAPTER 17 Conformal Mapning
728
17.1 Geometry of Analytic Functions: Conformal Mapping
17.2 Linear Fractional Transformations 734
17.3 Special Linear Fractional Transformations 737
701
729
Contents
xv
17.4 Conformal Mapping by Other Functions 742
17.5 Riemann Surfaces. Optional 746
Chapter 17 Review Questions and Prohlems 747
Summary of Chapter 17 748
CHAPTER 18 Coml')lex Analysis and Potential Theory
18.1 Electrostatic Fields 750
18.2 Use of Conformal Mapping. Modeling 754
18.3 Heat Problems 757
18.4 Ruid Flow 761
18.5 Poisson's Integral Formula for Potentials 768
18.6 General Properties of Harmonic Function.. 771
Chapter 18 Review Questions and Problems 775
Summary of Chapter 18 776
PART E
Numeric Analysis 777
Software 778
CHAPTER 19 Numerics in General
780
19.1 [ntroduction 780
19.2 Solution of Equations by Iteration 787
19.3 [nterpolation 797
19.4 Spline Interpolation 810
19.5 Numeric Integration and Differentiation 817
Chapter 19 Review Questions and Problems 830
Summary of Chapter 19 831
CHAPTER 20
Numeric Linear Algebra
833
20.1 Linear Systems: Gauss EliminatIOn 833
20.2 Linear Systems: LU-Factorization. Matrix Inversion 840
20.3 Linear Systems: Solution by Iteration 845
20.4 Linear Systems: III-Conditioning. Norms 851
20.5 Least Squares Method 859
20.6 Matrix Eigenvalue Problems: Introduction 863
20.7 [ncIusion of Matrix Eigenvalues 866
20.8 Power Method for Eigenvalues 872
20.9 Tridiagonalization and QR-Factorization 875
Chapter 20 Review Questions and Problems 883
Summary of Chapter 20 884
CHAPTER 21
Numerics for ODEs and PDEs
886
21.1 Methods for First-Order ODEs 886
21.2 Multistep Methods 898
21.3 Methods for Systems and Higher Order ODEs 902
21.4 Methods for Elliptic PDEs 909
21.5 Neumann and Mixed Problems. Inegular Boundary 917
21.6 Methods for Parabolic PDEs 922
21.7 Method for Hyperbolic PDEs 928
Chapter 21 Review Questions and Problems 930
Summary of Chapter 21 932
749
xvi
Contents
PART F
Optimization, Graphs 935
CHAPTER 22 Unconstrained Optimization. Linear
22.1 Basic Concepts. Unconstrained Optimization 936
22.2 Linear Programming 939
22.3 Simplex Method 944
22.4 Simplex Method: Difficulties 947
Chapter 22 Review Questions and Problems 952
Summary of Chapter 22 953
CHAPTER 23 Graphs. Combinatorial Optimization
23.1 Graphs and Digraphs 954
23.2 Shortest Path Problems. Complexity 959
23.3 Bellman's Principle. Dijkstra's Algorithm 963
23.4 Shortest Spanning Trees. Greedy Algorithm 966
23.5 Shortest Spanning Trees. Prim's Algorithm 970
23.6 Flows in Networks 973
23.7 Maximum Flow: Ford-Fulkerson Algorithm 979
23.8 Bipartite Graphs. Assignment Problem~ 982
Chapter 23 Review Questions and Problems 987
Summary of Chapter 23 989
PART G
Programming
954
Probability, Statistics 991
CHAPTER 24 Data Analysis. Probability Theory
993
24.1 Data Representation. Average. Spread 993
24.2 Experiments, Outcomes, Events 997
24.3 Probability 1000
24.4 Permutations and Combinations 1006
24.5 Random Variables. Probability Distributions 1010
24.6 Mean and Variance of a Distribution 1016
24.7 Binomial. Poisson, and Hypergeometric Distributions 1020
24.8 Normal Distribution 1026
24.9 Distributions of Several Random Variables 1032
Chapter 24 Review Questions and Problems 1041
Summary of Chapter 24 1042
CHAPTER 25 Mathematical Statistics
1044
25.1 Introduction. Random Sampling 1044
25.2 Point Estimation of Parameters 1046
25.3 Confidence Intervals 1049
25.4 Testing Hypotheses. Decisions 1058
25.5 Quality Control 1068
25.6 Acceptance Sampling 1073
25.7 Goodness of Fit. x2-Test 1076
25.8 Nonparametric Tests 1080
25.9 Regression. Fitting Straight Lines. Correlation 1083
Chapter 25 Review Questions and Problems 1092
Summary of Chapter 25 1093
936
xvii
Contents
APPENDIX 1
References
APPENDIX 2
Answers to Odd-Numbered Problems
Al
APPENDIX 3 Auxiliary Material
A60
A3.1 Formulas for Special Functions A60
A3.2 Partial Derivatives A66
A3.3 Sequences and Series A69
A3.4 Grad, Div, Curl, V 2 in Curvilinear Coordinates
APPENDIX 4
Additional Proofs
APPENDIX 5
Tables
PHOTO CREDITS
INDEX
11
A94
Pl
A74
A7l
A4
••••
PA R T
. ......... .,It ."
f
• •
-oj
'IIH'II
I
.. ..
..........
0.'-
,
.....
e•
I,:
"
A
Ordinary
Differential
Equations (ODEs)
C HAP T E R 1
First-Order ODEs
C HAP T E R 2
Second-Order Linear ODEs
C HAP T E R 3
Higher Order Linear ODEs
C HAP T E R 4
Systems of ODEs. Phase Plane. Qualitative Methods
C HAP T E R 5
Series Solutions of ODEs. Special Functions
C HAP T E R 6
Laplace Transforms
Differential equations are of basic importance in engineering mathematics because many
physical laws and relations appear mathematically in the form of a differential equation.
In Part A we shall consider various physical and geometric problems that lead to
differential equations, with emphasis on modeling, that is, the transition from the physical
situation to a "mathematical model." In this chapter the model will be a differential
equation, and as we proceed we shall explain the most important standard methods for
solving such equations.
Part A concerns ordinary differential equations (ODEs), whose unknown functions
depend on a single variable. Partial differential equations (PDEs), involving unknown
functions of several variables, follow in Part C.
ODEs are very well suited for computers. Numeric methods for ODEs call be studied
directly after Chaps. 1 or 2. See Sees. 21.1-21.3, which are independent of the other
sections on numerics.
I
9
t~
••••
I CHAPTER
~~
" I ..
:;....~.,..
.--
~
il;;ilil•
~
lillI,
--oj...
-I' -.~
"
1
I
:.
First-Order ODEs
In this chapter we begin our program of studying ordinary differential equations (ODEs)
by deriving them from physical or other problems (modeling), solving them by standard
methods, and interpreting solutions and their graphs in terms of a given problem. Questions
of existence and uniqueness of solutions will also be discussed (in Sec. l.7).
We begin with the simplest ODEs, called ODEs of the first order because they invol ve
only the first derivative of the unknown function, no higher derivatives. Our usual
notation for the unknown function will be y(x). or yet) if the independent variable is
time t.
If you wish, use your computer algebra system (CAS) for checking solutions, but make
sure that you gain a conceptual understanding of the basic terms, such as ODE, direction
field, and initial value problem.
COMMENT. Numerics for first-order ODEs call be studied immediately after this
chapter. See Secs. 2l.1-2l.2, which are independent of other sections on numerics.
Prerequisite: Integral calculus.
Sections that may be omitted in a shorter course: l.6, 1.7.
References and Answers to Problems: App. I Part A. and App. 2
1.1
Basic Concepts. Modeling
If we want to solve an engineering problem (usually of a physical nature). we first have
to formulate the problem as a mathematical expression in terms of variables, functions,
equations, and so forth. Such an expression is known as a mathematical model of the
given problem. The process of setting up a model, solving it mathematically, and
interpreting the result in physical or other terms is called mathematical modeling or, briefly,
modeling. We shall illustrate this process by various examples and problems because
modeling requires experience. (Your computer may help you in solving but hardly in
setting up models.)
Since many physical concepts, such as velocity and acceleration. are derivatives. a
model is very often an equation containing derivatives of an unknown function. Such
a model is called a differential equation. Of course, we then want to find a solution
(a function that satisfies the equation), explore its properties, graph it, find values of it,
and interpret it in physical terms so that we can understand the behavior of the physical
system in our given problem. However, before we can tum to methods of solution we
must first define basic concepts needed throughout this chapter.
SEC. 1.1
3
Basic Concepts. Modeling
Velocity
v
Water level h
Falling stone
Parachutist
y" = g = canst.
(Sec. 1.1)
mv'=mg-bv
Outflowing water
2
h'=-Il'fFt
(Sec. 1.2)
(Sec. 1.3)
y
" - ,\,
//i/ltl/ :\I
\
1\1
I;J
\
\
Displacement y
Vibrating mass
on a spring
my"+ky= 0
(Secs. 2.4, 2.8)
\
-'./
I
/
I
I
I'
I
t
I
\.
\
I
'l.. -- . /
Current I in an
RLC circuit
Beats of a vi brati ng
system
y" + liJ~y = cos rot, roo =
(Sec. 2.8)
Q)
LI" +RI' +lI=E'
C
(Sec. 2.9)
Lotka-Volterra
predator-prey model
Deformation of a beam
V
Pendulum
EI/ = f(x)
L8"+gsinB=O
(Sec. 3.3)
(Sec. 4.5)
Fig. 1.
y;= aY I - bY IY2
Y~ = kY 1Y 2 -IY 2
(Sec. 4.5)
Some applications of differential equations
4
CHAP. 1
First-Order ODEs
An ordinary differential equation (ODE) is an equation that contains one or several
derivatives of an unknown function, which we usually call y(x) (or sometimes yet) if the
independent variable is time t). The equation may also contain y itself, known functions
of x (or t), and constants. For example,
I
Y = cos x,
(1)
(2)
y"
+ 9)'
0,
=
(3)
are ordinary differential equations (ODEs). The term ordinary distinguishes them from
partial differelltial equations (PDEs), which involve partial derivatives of an unknown
function of two or more variables. For instance, a PDE with unknown function u of two
variables x and y is
PDEs are more complicated than ODEs; they will be considered in Chap. 12.
An ODE is said to be of order n if the 11th derivative of the unknown function \" is the
highest derivative of y in the equation. The concept of order gives a useful classification
into ODEs of first order, second order, and so on. Thus, (1) is of first order, (2) of second
order, and (3) of third order.
In this chapter we shall consider first-order ODEs. Such equations contain only the
first derivative y' and may contain y and any given functions of x. Hence we can write
them as
(4)
F(x, y, y')
= 0
or often in the form
y' = f(x. \').
This is called the explicit fonn. in contrast with the implicit form (4). For instance, the
implicit ODE x- 3 y' - 4y2 = 0 (where x *- 0) can be written explicitly as y' = 4x 3 )'2.
Concept of Solution
A function
y = hex)
is called a solution of a given ODE (4) on some open interval a < x < b if h(-r) is defined
and differentiable throughout the interval and is such that the equation becomes an identity
if y and)' are replaced with hand h', respectively. The curve (the graph) of h is called
a solution curve.
Here, open interval a < x < b means that the endpoints a and b are not regarded as
points belonging to the interval. Also, a < x < b includes infinite intervals -00 < x < b,
a < x < 00, -00 < x < 00 (the real line) as special cases.
I
SEC. 1.1
5
Basic Concepts. Modeling
E X AMP L E 1
Verification of Solution
=
y' =
y
E X AMP L E 2
*"
hex) = c/x tc an arbitrary constant, x
0) is a solution of Xl" = -y. To verify this. differentiate,
h' (x) = -c/x2 , and multiply by x to get xy' = -c/x = -y. Thus, xy' = -y, the given ODE.
•
Solution Curves
The ODE y' = dyldx = cos x can be solved directly by integration on both sides. Indeed. using calculus. we
obtain y = f cos x dx = sin x + c, where c is an arbitrary constant. This is afamily of solutions. Each value
of c, for instance. 2.75 or 0 or -8. gives one of these curves. Figure 2 shows some of them. for c = -3. -2,
-1,0, 1,2.3.4.
•
y
Solutions y = sin x
Fig. 2.
EXAMPLE 3
+ c of the ODE y'
cos x
=
Exponential Growth, Exponential Decay
From calculu~ we know that y = ce 3t (c any constant) has the derivative (chain rule!)
,
y =
dv
dl
= 3ce
3t
= 3y.
This shows that y is a solution of y' = 3)'. Hence this ODE can model exponential growth, for instance. of
animal populations or colonies of bacteria. It also applies to humans for small population~ in a large country
(e.g .. the United States in early times) and is then known as Malt/illS's law. I We shall say more about this topic
in Sec. 1.5.
Similarly. y' = -O.2y (with a minus on the right!) has the solution y = ce- O.2t . Hence this ODE models
exponential decay, for instance. of a radioactive substance (see Example 5). Figure 3 shows solutions for some
positive c. Can you find what the solutions look like for negative c?
•
y
2.5
12
Fig. 3.
Solutions of y'
=
14
t
-O.2y in Example 3
INamed after the English pioneer in classic economics, THOMAS ROBERT MALTHUS (1766-1834)
6
CHAP. 1 First-Order ODEs
We see that each ODE in these examples has a solution that contains an arbitrary constant
c. Such a solution containing an arbitrary constant c is called a general solution of the
ODE.
(We shall see that c is sometimes not completely arbitrary but must be restricted to
some interval to avoid complex expressions in the solution.)
We shall develop methods that will give general solutions uniquely (perhaps except for
notation). Hence we shall say the general solution of a given ODE (instead of a general
solution).
Geometrically, the general solution of an ODE is a family of infinitely many solution
curves, one for each value of the constant c. If we choose a specific c (e.g., c = 6.45 or
o or -2.01) we obtain what is called a particular solution of the ODE. A particular
solution does not contain any arbitrary constants.
In most cases, general solutions exist, and every solution not containing an arbitrary constant
is obtained as a particular solution by assigning a suitable value to c. Exceptions to these
rules occur but are of minor interest in applications: see Frob. 16 in Problem Set 1.1.
Initial Value Problem
In most cases the unique solution of a given problem, hence a particular solution, is
obtained from a general solution by an initial condition y(xo) = Yo, with given values
Xo and Yo, that is used to determine a value of the arbitrary constant c. Geometrically
this condition means that the solution curve should pass through the point (xo, Yo) in
the .C\:.v-plane. An ODE together with an initial condition is called an initial value
problem. Thus, if the ODE is explicit, y' = f(x, y), the initial value problem is of the
form
(5)
E X AMP L E 4
y' = f(x,
y),
Y(Xo)
=
Yo·
Initial Value Problem
Solve the initial value problem
,
y
=
dy
dx = 3)"
yeO) = 5.7.
The general solution is y(x) = ce 3x ; see Example 3. From this solution and the inttial condition
we obtain yeO) = ceo = c = 5.7. Hence the initial value problem ha~ the solution )'(x) = 5.7e 3x . This is a
particular solution.
•
Solution.
Modeling
The general importance of modeling to the engineer and physicist was emphasized at the
beginning of this section. We shall now consider a basic physical problem that will show
the typical steps of modeling in detail: Step I the transition from the physical situation
(the physical system) to its mathematical formulation (its mathematical model); Step 2
the solution by a mathematical method; and Step 3 the physical interpretation of the result,
This may be the easiest way to obtain a first idea of the nature and purpose of differential
equations and their applications. Realize at the outset that your computer (your CAS) may
perhaps give you a hand in Step 2, but Steps 1 and 3 are basically your work. And Step 2
SEC. 1.1
7
Basic Concepts. Modeling
requires a solid knowledge and good understanding of solution methods available to youyou have to choose the method for your work by hand or by the computer. Keep this in
mind, and always check computer results for enors (which may result, for instance, from
false inputs).
EXAMPLE 5
Radioactivity. Exponential Decay
Given an amount of a radioactive substance, say, 0.5 g (gram), find the amount present at any later time.
Physical Inj"o171wtio11. Experiments show that at each instant a radioactive substance decomposes at a rate
proportIOnal to the the amount present.
Step 1. Setting lip a mathematical model (a differential equation) of the physical process. Denote by yet) the
amount of substance still present at any time t. By the physical law, the time rate of change y' (t) = dyldt is
proporhonal to yet). Denote the constant of proportionality by k. Then
dy
dt
(6)
=
kyo
The value of k is known from experiments for various radioactive substances (e.g.. k = -1.4· lO-llsec -1.
approximately, for radium ssRa226). k is negative because ylt) decreases with time. The given initial amount is
0.5 g. Denote the corresponding time by t = O. Then the initial condition is y(O) = 0.5. This is the instant at
which the process begins; this motivates the term initial condition (which, however, is also used more generally
when the independent variable is not time or when you choose a t other than t = 0). Hence the model of the
process is the initial value problem
(7)
yeO)
dt = ky,
=
0.5.
Step 2. Mathematical soilltion. As in Example 3 we conclude thaI the ODE (6) models exponemial decay and
has the general solution (with arbitrary constant c but definite given k)
(8)
We now use the initial condition to detelwine
particular solution governing this process is
C.
Since yeO) = c from (8), this gives .1'(0) = c = 0.5. Hence the
y(t) = O.Se kt
(9)
(Fig. 4).
Always check YOllr reslllt-it may involve human or computer errors! Verify by differentiation (chain rule!)
that your solution (9) satisfies (7) as well as yeO) = 0.5:
dv
--=-dt
= O.Ske
kt
= k' O.Se kt = ky.
yCO)
=
O.Se o
=
0.5.
Step 3. liltelpretation of reslllt. Formula (9) gives the amount of radioactive substance at tIme I. It starts from
the correct given initial amount and decreases with time because k (the constant of proportionality, depending
on the kind of substance) is negative. The limit of y as t -> x is zero.
•
JI~
o
0.5
I
1.5
2
2.5
3
Fig. 4. Radioactivity (Exponential decay,
y = 0.5 e kt, with k = -1.5 as an example)
CHAP. 1
8
E X AMP L E 6
First-Order ODEs
A Geometric Application
Geometric problems may also lead to initial value problems. For instance, find the curve through the point
(I. I) in the .l,)·-plane having at cach of its points the slope -)1x.
Solution. The slope y' should equal
-)Jx. This gives the ODE y' = -)1x. Its general solution is y = elx
(see Example 1). This is a family of hyperbolas with the coordinate axes as asymptotes.
Now, for the curve to pass through (1, I), we must have y = 1 when x = I. Hence the initial condition is
y(l) = 1. From this condition and y = elx we get yO) = ell = I; that is, c = 1. This gives the particular
•
solution y = lIx (drawn somewhat thicker in Fig. 5).
y
y
Solutions of y'
Fig. 5.
----- •• =
----
=
15. (Existence) (A) Does the ODE y'2
CALCULUS
Solve the ODE by integration.
15-91
2. y'
4. y'
= -sin TTX
x2/2
= xe
(B) Does the ODE
solution?
cosh 4x
VERIFICATION OF SOLUTION
+
5. y'
=
6. y"
+ 7T y = 0,
+ 2,,' + lOy
+ 2y = 4(x +
1
y2,
2
9. y'" = cos x,
110-141
+
y = tan (x
Y
=
=
O.
1)2,
c)
+ b sin 7TX
4e- x sin 3x
Y = 5e- 2x + 2.1'2 + 2T
a cos rrx
Y
y = -sin x
=
+
ax 2
+
bx
+
+ ('
INITIAL VALUE PROBLEMS
Verify that y is a solution of the ODE. Determine from y
the particular solution satisfying the given initial condition.
Sketch or graph this solution.
10. y'
0.5y.
Y = eeO. 5,,:.
y(2) = 2
11. y' = I + 4y2, Y = ~ tan (2x + e), yeO) = 0
12. y' = y - x, y = ce x + x + 1, yeO) = 3
13. y' + 2xy = 0, y = ee-~:2. y( 1) = 1Ie
14. y'
=
y tan x,
= -]
have a (real)
solution?
State the order of the ODE. Verify that the given function
is a solution. (a, 17, e are arbitrary constants.)
7. y"
8. y'
Particular solutions and Singular
solution in Problem 16
u
11-41
1. y'
3. y'
Fig. 6.
-y/x (hyperbolas)
y = c sec x,
yeO)
= ~7T
1/1 + Iyl
=
0 have a general
16. (Singular solution) An ODE may sometimes have an
additional solution that cannot be obtained from the
general solution and is then called a singular solution.
The ODE /2 - XV' + Y = 0 is of the kind. Show by
differentiation and substitution that it has the general
solution y = ex - e 2 and the singular solution y = x 2/4.
Explain Fig. 6.
117-221
MODELING, APPLICATIONS
The following problems will give you a first impression of
modeling. Many more problems on modeling follow
throughout this chapter.
17. (Falling body) If we drop a stone, we can assume air
resistance ("drag") to be negligible. Experiments show
that under that assumption the acceleration y" = d 2Yldt 2
of this motion is constant (equal to the so-called
acceleration of gravity g = 9.80 mlsec 2 = 32 ftlsec 2 ).
State this as an ODE for yet), the distance fallen a~ a
function of time t. Solve the ODE to get the familiar
law of free fall, y = gt 2 /2.
SEC. 1.2
18. (Falling body) If in Prob. 17 the stone starts at t = 0
from initial position Yo with initial velocity u = uo,
show that the solution is y = gt 2 /2 + uot + Yo. Hov.
long does a fall of 100 m take if the body falls from
rest? A fall of 200 m? (Guess first.)
is the half-life of radium 88Ra226 (in years) in
Example 5?
22. (Interest rates) Show by algebra that the investment y(t)
from a deposit Yo after t years at an imerest rate r is
19. (Airplane takeoff) If an airplane has a run of 3 km,
statts with a speed 6 mlsec, moves wIth constant
acceleration, and makes the run in I min, with what
speed does it take off?
20. (Subsonic flight) The efficiency of the engines of
subsonic airplanes depends on air pressure and usually
is maximum near about 36 000 f1. Find the air pressure
rex) at this height without calculation. Ph\'Sical
i'!formGtioll. The ;ate of change y' (x) is propOliionai
to the pressure, and at 18 000 ft the pressure has
decreased to half its value )'0 at sea level.
21. (Half-life) The half-life of a radioactive substance is
the time in which half of the given amount disappears.
Hence it measures the rapidity of the decay. What
1.2
9
Geometric Meaning of y' = f(x, y). Direction Fields
+
+
Ya(t) = yo[1
-"d(t) = .ro[l
r]t
(Interest compounded annually)
(rl365)]365t
(Interest compounded daily).
Recall from calculus that
[1
hent:e [I
+
+ (llll)r - 7
(rln)]nt
--+
e as 11
--+
x;
e't: thu~
(Interest compounded continuously).
What ODE does the last function satisfy? Let the
initial investment be $1000 and r = 6%. Compute the
value of the investment after I year and after 5 years
using each of the three formulas. [s there much
difference?
Geometric Meaning of y'
Direction Fields
t(x, y).
A first-order ODE
(1)
y' = f(x, y)
has a simple geometric interpretation. From calculus you know that the derivative y' (x)
of y(x) is the slope of y(x). Hence a solution curve of (1) that passes through a point
(xo, )'0) must have at that point the slope y' (xo) equal to the value of f at that point; that is,
Read this paragraph again before you go on, and think about it.
It follows that you can indicate directions of solution curves of (I) by drawing short
straight-line segments (lineal elements) in the ,,=,·-plane (as in Fig. 7a) and then fitting
(approximate) solution curves through the direction field (or slope field) thus obtained.
This method is important for two reasons.
1. You need not solve (I). This is essential because many ODEs have complicated
solution formulas or none at all.
2. The method shows, in graphical form, the whole family of solutions and their typical
properties. The accuracy is somewhat limited, but in most cases this does not matter.
Let us illustrate this method for the ODE
(2)
y
, = .X)'.
10
CHAP. 1 First-Order ODEs
Direction Fields by a CAS (Computer Algebra System). A CAS plots lineal elements
at the points of a square grid. as in Fig. 7a for (2), into which you can fit solution curves.
Decrease the mesh size of the grid in regions where I(x, y) varies rapidly.
Direction Fields by Using Isoclines (the Older Method). Graph the curves
I(x, y) = k = const, called isoclines (meaning curves of eqllal inclination). For (2) these
are the hyperbolas I(x, y) = xy = k = const (and the coordinate axes) in Fig. 7b. By (1),
these are the curves along which the derivative y' is constant. These are not yet solution
curves--don't get confused. Along each isocline draw many parallel line elements of the
corresponding slope k. This gives the direction field. into which you can now graph
approximate solution curves.
We mention that for the ODE (2) in Fig. 7 we would not need the method, because we
shall see in the next section that ODEs such as (2) can easily be solved exactly. For the
time being, let us verify by substitution that (2) has the general solution
y(x)
=
(c arbitrary).
ce"2/2
Indeed, by differentiation (chain rule!) we get y' = x(cex2/2 ) = xy. Of course. knowing
the solution, we now have the advantage of obtaining a feel for the accuracy of the
method by comparing with the exact solution. The particular solution in Fig. 7 through
(x, y) = (1,2) must satisfy y(l) = 2. Thus. 2 = ce 1l2 , c = 21Ve = 1.213, and the particular
solution is y(x) = 1.213ex2/2 .
A famous ODE for which we do need direction fields is
(3)
y
(It is related to the van der Pol equation of electronics. which we shall discuss in Sec. 4.5.)
The direction field in Fig. 8 shows lineal elements generated by the computer. We have
also added the isoclines for k = - 5, - 3,~, I as well as three typical solution curves, one
that is (almost) a circle and two spirals approaching it from inside and outside.
y
\
\
\
\
\
\
\(
\
\
\
\
\
,
, 2
""
...... 1
I
/
I
/
I
J
I
I
I
I
J
J
/
/
/
I
J
/
-1
,
/
/
."
I
/
/
!
I
I
/
I
I
I
I
/
I
I
'" '"
/
I
,
/
I
I
I
I
/
/
I
I
/
/
/
I
\
\
\
\
J
I
/
-1. / /
/
y
/
.)
...
/
""
\
x
x
\
"
\
\
\
\
\
\
(aj Bya CAS
(b) By isoclines
Fig. 7.
Direction field of y
,=
xy
SEC. 1.2
Geometric Meaning of y' = f(x, y). Direction Fields
11
y
-4
'-
"
\
\
'- \
\
\
\
\
k =!
4/
\
/
k =-3
k = 1 '---~"7"--'--
/
I
4,
\
\
\
Fig. S.
" '"
\ " '-
x
~ -4~~C/
Direction field of y' = 0.1 (1 - x 2 )
x
-
-
Y
On Numerics
Direction fields gi ve "all" solutions, but with limited accuracy. If we need accurate numeric
values of a solution (or of several solutions) for which we have no formula, we can use
a numeric method. If you want to get an idea of how these methods work, go to Sec.
21.1 and study the first two pages on the Euler-Cauchy method, which is typical of
more accurate methods later in that section, notably of the classical Runge-Kutta method.
It would make little sense to interrupt the present flow of ideas by including such methods
here; indeed, it would be a duplication of the material in Sec. 21.1. For an excursion to
that section you need no exn'a prerequisites; Sec. 1.1 just discussed is sufficient.
11-101
DIRECTION FIELDS, SOLUTION CURVES
Graph a direction field (by a CAS or by hand). In the field
graph approximate solution curves through the given point
or points (x, y) by hand.
1. y' = eX - y. (0. 0), (0. I)
2. 4yy'
3.
y' =
=
-9x, (2, 2)
1
+
y2, (~1T,
I)
111-151
ACCURACY
Direction fields are very useful because you can see
solutions (as many as you want) without solving the ODE,
which may be difficult or impossible in terms of a formula.
To get a feel for the accuracy of the method, graph a field,
sketch solution curves in it, and compare them with the
exact solutions.
Ll.y'
4. y'
=
y - 2y2. (0. 0). (0. 0.25). (0. 0.5). (0. I)
13. y'
5. y'
=
x2
6. y'
=
14. .r'
15. y'
7. V'
=
8. y'
=
-
IIy, (I, -2)
+ siny, (-1, 0), (1,4)
3
y3 + x , (0, 1)
2xy + I, (-I, 2), (0, 0), (1,
I
-2)
9. y' = y tanh x - 2, (-1, -2), (1,0), (1, 2)
10. y'
= eY/x, (I, I), (2, 2), (3, 3)
sin ~1TX
12. y' = I/x2
-2y (SoL y = ce- 2x )
3ylx (Sol. y
=
cx 3 )
-In x
MOTIONS
A body moves on a straight line, with velocity as given.
and yet) is its distance from a fixed point 0 and t time. Find
a model of the motion (an ODE). Graph a direction field.
CHAP. 1
12
First-Order ODEs
In it sketch a solution curve corresponding to the given
initial condition.
16. Velocity equal to the reciprocal ofthe distance, y(l)
=
20. CAS PROJECT. Direction Fields. Discuss direction
fields as follows.
I
(a) Graph a direction field for the ODE y' = I - Y
and in it the solution satisfying yeO) = 5 showing
exponential approach. Can you see the limit of any
solution directly from the ODE? For what initial
condition will the solution be increasing? Constant?
Decreasing?
17. Product of velocity and distance equal to -t, y(3) = -3
18. Velocity plus distance equal to the square of time,
yeO) = 6
19. (Skydiver) Two forces act on a parachutist, the
attraction by the earth mg (/1/ = mass of person plus
equipment. g = 9.8 m/sec 2 the acceleration of gravity)
and the air resistance, assumed to be proportional to
the square of the velocity vet). Using Newton's second
law of motion (mass X acceleration = resultant of the
forces), set up a model (an ODE for v(t». Graph a
direction field (choosing III and the constant of
proportionality equal to 1). Assume that the parachute
opens when v = 10m/sec. Graph the corresponding
solution in the field. What is the limiting velocity?
1.3
(b) What do the solution curves of y' = _X 3/y3 look
like, as concluded from a direction field. How do they
seem to differ from circles? What are the isoclines?
What happens to those curves when you drop the minus
on the right? Do they look similar to familiar curves?
First. guess.
(c) Compare. as best as you can, the old and the
computer methods, their advantages and disadvantages.
Write a short report.
Separable ODEs. Modeling
Many practically useful ODEs can be reduced to the form
(1)
g(y)y' = f(x)
by purely algebraic manipUlations. Then we can integrate on hoth sides with respect to x,
obtaining
(2)
Ig(y)
y'
£Ix
=
If(x) eLr
+ c.
On the left we can switch to y as the variable of integration. By calculus, y' d" = dy. so
that
(3)
I g(y) £I)' = I f(x) £Ix
+ c.
If f and g are continuous functions, the integrals in (3) exist, and by evaluating them we
obtain a general solution of (1). This method of solving ODEs is called the method of
separating variables, and (1) is called a separable equation, because in (3) the variables
are now separated: x appears only on the right and y only on the left.
E X AMP L E 1
A Separable ODE
The ODE y' = 1
dv
--·-2
1
+Y
+ y2 is
= d.1:_
separable because it can be written
By inlegration,
arctany
=x+c
or
y = tan (x
+ c).
SEC. 1.3
13
Separable ODEs. Modeling
It is very impOltallt to illtroduce the COl/stallt 0/ illtegratioll immediately whell the integratioll is perjonlled.
+ c, which
is not a solution (\~hen c
Verify this.
' .
If we wrote arctan ,. = x, then v = tan x. and thell introduced c. we would have obtained ,. = tan x
'* 0):
Modeling
The importance of modeling was emphasized in Sec. 1.1, and separable equations yield
various useful models. Let us discuss this in terms of some typical examples.
E X AMP L E 2
Radiocarbon Dating2
In September 11)1) I the famous Iceman (OetLi). a mummy from the Neolithic period of the Stone Age found in
the ice of the Oetftal Alps (hence the name "Oetzi in Southcrn Tyrolia near the Austrian-Italian border. caused
a scientific sensation. When did Oetzi approximately live and die if the ratio of carbon 6Cl4 to carbon 6Cl2 in
this mummy is 52.5% of that of a living organism?
Physical/II/ormatiol!. In the atmosphere and in living organisms, the ratio of radioactive carbon 6C14 (made
radioactive by cosmic rays) to ordinary carbon 6C12 is con~tant. When an orgamsm dies, its absorption of 6C14
by breathing and eating terminates. Hence one can estimate the age of a fossil by comparing the radioactive carbon
ratio in the fossil with that in the atmosphere. To do this. one needs to know the half-life of 6C14. which is 5715
years (CRC Halldbook a/Chemistry alld Physics, R3rd ed.. Boca Raton: CRC Press. 2002, page II-52. line 9).
OO
)
Solutioll.
Modelillg. Radioactive decay is governed by the ODE y' = ky hee Sec. 1.1. Example 51. By
separation and integration (where t is time and Yo is the initial ratio of ~14 to 6C12)
dy
=
In 13,1 = kt + c.
kdt.
)'
Next we use the half-life H = 5715 to determine k. When t = H. half of the original ~ubstance is still present.
Thus.
In 0.5
0.693
-0.0001213.
k=
5715
H
Finally. we use the ratio 52.5% for determining the time t when OetLi died (actually, was kiUed).
ekt
=
e-0 OOOl213t
= 0.525,
r =
In 0.525
-~---=- =
-0.0001213
5312.
Allswer:
About 5300 years ago.
Other methods show that mdiocarbon dating values are usually too small. According to recent research. this is
due to a variation in that carbon ratio because of industrial pollution and other factors. such as nuclear testing. •
E X AMP L E 3
Mixing Problem
Mixing problems occur quite frequently in chemical industry. We explain here hov. to solve the basic model
involving a single tank. The tank in Fig. I) contains 1000 gal of water in which initially 100 Ib of salt is dissolvcd.
Brine runs in at a rate of 10 gal/min. and each gallon contains 51b of dissoved salt. The mixture in the tank is
kept uniform by ~tirring. Brine ['Uns our at 10 gal/min. Find the amount of salt in the tank at any time t.
Solution.
Step 1. Settillg up a model. Let ,.( r) denote the amount of salt in the tank at time r. Its time rate
of change is
y' = Salt inflow rate - Salt outflow rate
"Balance law".
51b times 10 gal gives an inflow of 50 Ib of salt. Now. the outflow is IO gal of brine. This is 1011000 = 0.01
(= 1%) of the total brine content in the tank, hence 0.01 of the salt content yet), that is. 0.01.1'(0. Thus the model
is the ODE
(4)
y'
= 50 -
O.OJy = -O.Ol(y - 5000).
2Method by WILLARD FRANK UBBY (1908-1980), American chemist, who was awarded for this work
the 1960 Nobel Prize in chemistry.
CHAP. 1
14
First-Order ODEs
Step 2. Sollltioll of the model. The ODE (4) is separable. Separation, integration, and taking exponents on both
sides gives
dy
= -0.01 dl,
Y - 5000
In
L\' -
50001 = -O.Ol t
J' - 5000 = ce- o.Olt .
+ c*,
Initially the tank contains 100 Ib of salt. Hence yeO) = 100 is the initial condition that will give the unique
solution. Substituting), = 100 and t = 0 in the last equation give~ 100 - 5000 = ceo = c. Hence c = -4900.
Hence the amount of salt in the tank at time t is
yet)
(5)
=
5000 - 4900e -o.Olt.
This function shows an exponential approach to the limit 5000 Ib: see Fig. 9. Can you explain physically that
yet) should increase with time? That its limit is 5000 Ib? Can you see the limit directly from the ODE?
The model di~cllssed becomes more realistic in problems on pollutants in lakes (sec Problem Set 1.5. Prob.
27) or drugs in organs. These types of problems are more difficult because the mixing may be imperfect and
the flow rates (in and out) may be different and known only vel)' roughly.
•
Y
-----------_.-=-=--=-=--
5000
4000
3000
2000
1000
100~
o
__ __- L_ _
100 200 300
~
Tank
__~__- L_ _ _
400
500
Salt contenty(t)
Fig. 9.
E X AMP L E 4
~L-
Mixing problem in Example 3
Heating an Office Building (Newton's Law of Cooling})
Suppo,e that in Winter the daytime temperature in a certain office building is maintained at 70°F, The heating
is shUl off at 10 P.M. and tumed on again at 6 A.M. On a certain day the temperature inside the building at
2 A.M. was found to be 65°F. The outside temperature was 50°F at 10 P.M. and had dropped to 40°F by 6 A.M.
What was the temperature inside the building when the heat was turned on at 6 A.M.?
Physical information. Experiments show that the time rate of change of the temperature T of a body B (which
conducts heat well, as, for example, a copper hall does) is proportional to the difference hetween T and the
temperature of the sUlTounding medium (Newton's law of cooling).
SOllitioll. Step L Settillg lip a model. Let T(t) be the temperature inside the building and TA the outside
temperature (assumed to be constant in Newton's law). Then by Newton's law,
(6)
Such experimental laws are derived under idealized assumptions that rarely hold exactly. However. even if a
model seems to fit the reality only poorly (as in the present case). it may still give valuable qualitative information.
To see how good a model is, the engineer will collect experimental data and compare them with calculations
from the modeL
3 Sir ISAAC NEWTON (1642-1727), great English physicist and mathematician. became a professor at
Camblidge in 1669 and Master of the Mint in 1699. He and the German mathematician and philosopher
GOTTFRlED WILHELM LEIBNIZ (1646--1716) invented (independently) the differential and integral calculus.
Newton discovered many basic physical laws and created the method of investigating physical problems by
means of calculus. His Philosophiae namralis principia mathematica (Mathematical Principle), of Natural
Philosophy, 1687) contains the development of classical mechanic~. His work is of greatest importance to both
mathematics and physics.
SEC. 1.3
15
Separable ODEs. Modeling
Step 2. General solution. We cannot solve (6) because we do not know TA , just that it varied between 50°F
and 40°F, so we follow the Goldell Rule: If you cannot solve your problem, try to solve a simpler one. We
solve (6) with the unknown function TA replaced with the average of the two known values, or 45°F. For physical
reasons we may expect that this will give us a reasonable approximate value of T in the building at 6 A.M.
For constant TA = 45 (or any other constant value) the ODE (6) is separable. Separation, integration, and
taking exponents gives the general solution
<IT
T _ 45 = k dt,
In
IT -
451 = kt
T(t) = 45 .,. ce kt
+ c*,
(c = e
C
\
Step 3. PQlticular solutioll. We choose IO P.M. to be t = O. Then the given initial condition is T(Ol = 70 and
yields a particular solution, call it Tp- By substitution.
T(O) = 45
+ ceo
= 70,
c = 70 - 45 = 25,
Step 4. Detel7nillatioll of k. We llse T(4) = 65, where t = 4 is 2
k into Tp(t) gives (Fig.
A.M.
Solving algebraically for k and inserting
10l
e 4k = 0.8,
Step 5. Answer alld illterpretation. 6
A.M.
k =
! In 0.8 =
-0.056,
Tp(t)
= 45
+ 25e -0.056t.
is t = 8 (namely, 8 hours after IO P.M.), and
Tp(8) = 45
+
25e -0.056·8 = 6] rOF].
Hence the temperature in the building dropped 9°F, a result that looks reasonable.
•
y
70
68
66
65
64
-------1'
I
I
I
~i -------~-------_"
~ ~
o
2
4
6
8 t
60 '-------'----'---
Fig. 10.
E X AMP L E 5
Particular solution (temperature) in Example 4
Leaking Tank. Outflow of Water Through a Hole (Torricelli's Law)
This is another prototype engineering problem that leads to an ODE. It concems the outflow of water from a
cylindrical tank with a hole at the bottom (Fig. 11). You are asked to find the height of the water in the tank at
any time if the tank has diameter 2 m, the hole has diameter 1 cm. and the initial height of the water when the
hole is opened is 2.25 m. When will the tank be empty?
Physical information. Under the intluence of gravity the outflowing water has velocity
(7)
vet) =
0.600V2gh(t)
where h(t) is the height of the water above the hole at lime
acceleration of gravity at the surface of the earth.
t,
(TorricelIi's law4 ),
and q = 980 cm/sec2 = 32.17 ft/sec 2 is the
Solutioll.
Step 1. Setti1lg up the model. To get an equation, we relate the decrease in water level h(t) to the
outflow. The volume ti V of the outflow during a short time tit is
tiV=Av!::.t
(A = Area of hole).
4 EV ANGELIST A TORRICELLI (1608-1647), Italian physicist, pupil and successor of GALILEO GALl LEI
(1564-1642) at Florence. The "contraction factor" 0.600 was introduced by 1. C. BORDA in 1766 because the
stream has a smaller cross section than the area of the hole.
16
CHAP. 1
First-Order ODEs
il V must equal the change il V* of the volume of the water in the tank. Now
il.V*
(B = Cross-sectional area of tank)
= -B.lh
where illz (> 0) is the decrease of the height h(t) of the water. The minus sign appears because the volume of
the water in the tank decreases. Equating il. V and il V* gives
-B il.il
=
Au ilt.
We now express u according to Torricelli's law and then let ilt (the length of the time interval considered)
approach O--this is a stalldard way of obtaining an ODE as a model. That is. we have
llh
-
ilt
and by letting .It -+
A
=- -
B
u
A
=- -
B
0.600Y2gh(t)
'
0we obtain the ODE
dh
A
= -2656 - Yh
dt
.
B
'
-
where 26.56 = 0.600 Y2' 980. This is our model, a first-order ODE.
Step 2. General solution. Our ODE is separable. AlB is constant. Separation and integration gives
-
dh
Yh
=
A
-26.56 - dt
B
Dividing by 2 and squaring gives il
yields the general sol ution
= (c -
A
2Yh = c* - 26.56 - t.
B
and
13.28AtIB)2. Inserting 13.28AIB = 13.28' 0.5 2 '/7/1002 '/7
= 0.000332
h(t) = (c - 0.000332t)2.
Step 3. Particular solution. The initial height (the initial condition) is h(O) = 225 cm. Substitution of t = 0
and Iz = 225 gives from the general solution c 2 = 225, c = 15.00 and thus the particular solution (Fig. I))
hp(t) = (15.00 - 0.000332t)2.
Step 4. Tallk empty. hp(t) = 0 if t = 15.0010.000332 = 45 181 [sec] = 12.6 [hOUTS].
Here you see distinctly the importal/ce of the choice of ul/its-we have been working with the Cgs system,
in which time i~ measured in seconds! We used g = 980 cmlsec2 .
•
Step 5. Checking. Check the result.
1:2.00mj
r
h
250
~-f
200
'-
150
2.25 m
h(t)
L
~
• Outflowing
t
water
Tank
Fig. 11.
"
100
50
0
0
10000
30000
50000
t
Water level h(tl in tank
Example 5. Outflow from a cylindrical tank ("leaking tank"). Torricelli's law
Extended Method:
Reduction to Separable Form
Certain nonseparable ODEs can be made separable by transformations that introduce for
y a new unknown function. We discuss this technique for a class of ODEs of practical
SEC. 1.3
17
Separable ODEs. Modeling
importance, namely, for equations
(8)
Here, f is any (differentiable) function of y/x, such as sin (y/x), (Y/X)4, and so on. (Such
an ODE is sometimes called a homogeneous ODE. a term we shall not use but reserve
for a more important purpose in Sec. 1.5.)
The form of such an ODE suggests that we set y/x = u; thus,
(9)
y
= ux
Substitution into),' = fey/x) then gives u' x
this can be separated:
(10)
E X AMP L E 6
y' = u'x
and by product differentiation
+
=
u
du
dx
feu) - u
x
feu) or u' x
=
+
u.
feu) - u. We see that
Reduction to Separable Form
Solve
Solution.
To gel the usual explicit form, divide the given equarion by 2n·.
Now substitute y and y' from (9) and then simplify by subtracting
,
1/
I/X+ 1/="2
,
1/
on both sides.
1/
ux=---2
211
211 '
You see that in the last equation you can now separate the variables,
211 dl/
I + u
2
dx
By integration,
x
Take exponents on both sides to get I
obtai n (Fig. 12)
+
1/
2
In(l
= clx or
I
+
+
2
1/ )
(y/x)2
= -In
= clx.
Ixl + c*
= In
I~ I + c*.
Multiply the last equation by x 2 to
Thus
This general solution represents a family of circles passing through the origin with centers on the x-axis
y
-8
\-4
-.J/'j
8
\~~
Fig. 12.
x
General solution (family of circles) in Example 6
•
CHAP. 1
18
First-Order ODEs
1. (Constant of integl'3tion) An arbitrary constant of
integration must be introduced inunediately when the
integration is performed. Why is this important? Give
an example of your own.
[2-91
GENERAL SOLUTION
Find a general solution. Show the steps of derivation. Check
your answer by substitution.
2. y' + (x + 2»)"2 = 0
3. y'
+
(y
9X)2
+
+ 36x
=
0
6. y' = (4x 2
+
y2)/(xy)
5. )'y'
7.
y' sin
Y cos
TTX =
=
+
)'2
26. (Gompel'tz gl'Owth in tumors) TIle Gompertz model
is r' = -Av In \' (A > 0), where yet) is the mass of
clinical observations. The declining growth rate with
increasing y > 1 corresponds to the fact that cells in
the interior of a tumor may die because of insufficient
oxygen and nutrients. Use the ODE to discuss the
growth and decline of solutions (tumors) and to find
constant solutions. Then solve the ODE.
9\" = v)
7TX
i)"2 + Y
8. xy' =
9. y' e""x
25. (Radiocal'bon dating) If a fossilized tree is claimed to
be 4000 years old. what should be its 6C14 content
expressed as a percent of the ratio of 6C14 to 6C12 in a
living organism?
tu~or cells' at time t. The model agrees well with
2 sec 2y
=
4. y' = {y
of individuals present, what is the popUlation as a
function of time? Figure out the limiting situation for
increasing time and interpret it.
I
27. (Dl-yel') If wet laundry loses half of its moisture
3
during the first 5 minutes of drying in a dryer and if
the rate of loss of moisture is proportional to the
moisture content, when will the laundry be practically
dry, say, when will it have lost 95% of its moisture?
First guess.
11. dr/dt = -2tr, r(O) = ro
28. (Alibi?) Jack, arrested when leaving a bar, claims that
L10-191
INITIAL VALUE PROBLEMS
Find the particular solution. Show the steps of derivation,
beginning with the general solution. (L, R, b are constants.)
10. yy'
+
12. 2xyy'
4x
=
=
3)"2
0, y(O)
+
=
he has been inside for at least half an hour (which
would provide him with an alibi). The police check the
water temperature of his car (parked near the entrance
of the bar) at the instant of arrest and again 30 minutes
later, obtaining the values 190°F and 110°F,
respectively. Do these results give Jack an alibi? (Solve
by inspection.)
x 2 , )"(1) = 2
13. L d/ldt + RI = 0, 1(0) = 10
14. v' = vlx + (2x 3 /v) COS(X2), y(v.;;:t2) = v;IS. >xy"= 2(x + 2i y 3, yeO) = l/Vs = 0.45
16. x.v' = y
+
4x 5 cos 2(y/x), H2)
= 0
17. y'x Inx = y, ."(3) = In 81
18. dr/dO = b[(dr/dO) cos 0 + r sin 0], r(i7T)
0< b < I
y2
19. yy' = (x - l)e- , y(O) = 1
7T.
20. (Particulal' solution) Introduce limits of integration in
(3) such that y obtained from (3) satisfies the initial
condition y(xo) = )'0' Try the formula out on Prob. 19.
=1-36J
APPLICATIONS, MODELING
21. (Curves) Find all curves in the xy-plane whose
tangents all pass through a given point (a, b).
22. (Cm'ves) Show that any (nonverticaD straight line
through the origin of the xy-plane intersects all solution
curves of y' = g()'/x) at the same angle.
23. (Exponential growth) If the growth rate of the amount
of yeast at any time t is proportional to the amount
present at that time and doubles in I week, how much
yeast can be expected after 2 weeks? After 4 weeks?
24. (Population model) If in a population of bacteria the
birth rate and death rate are proportional to the number
29. (Law of cooling) A thermometer, reading 10°C, is
brought into a room whose temperature is 23°C, Two
minutes later the thermometer reading is 18°C, How
long will it take until the reading is practically 23°C,
say, 22.8°C? First guess.
30. (TonicelIi's law) How does the answer in Example 5
(the time when the tank is empty) change if the
diameter of the hole is doubled? First guess.
31. (TolTicelli's law) Show that (7) looks reasonable
inasmuch as V2gh(t) is the speed a body gains if it
falls a distance h (and air resistance is neglected).
32. (Rope) To tie a boat in a harbor. how many times must
a rope be wound around a bollard (a vertical rough
cylindrical post fixed on the ground) so that a man
holding one end of the rope can resist a force exerted
by the boat one thousand times greater than the man
can exert? First guess. Experiments show that the
change /1S of the force S in a small portion of the rope
is proportional to S and to the small angle /1¢ in Fig.
13, Take the proportionality constant 0.15,
SEC. 1.4
Exact ODEs. Integrating Factors
19
Small
portion
of rope
(A) Graph the curves for the seven initial value
x2
problems y' = e- / 2 , yeO) = 0, ± I, ±2, ±3, common
axes. Are these curves congment? Why?
(B) Experiment with approximate curves of nth partial
sums of the Maclaurin series obtained by term wise
integration of that of y in (A); graph them and describe
qualitatively the accuracy for a fixed interval
o ~ x ~ b and increasing n, and then for fixed nand
increasing b.
S+l1S
Fig. 13.
Problem 32
(C) Experiment with
33. (Mixing) A tank contains 800 gal of water in which
200 Ib of salt is dissolved. Two gallons of fresh water
rLms in per minute, and 2 gal of the mixture in the tank.
kept uniform by stirring, runs out per minute. How
much salt is left in the tank after 5 hours?
=
cos (x 2 ) as in (8).
(D) Find an initial value problem with solution
t2
dt and experiment with it as in (8).
o
36. TEAM PROJECT. Tonicelli's Law. Suppose that
the tank in Example 5 is hemispherical, of radius R,
initially full of water, and has an outlet of 5 cm2 crosssectional area at the bottom. (Make a sketch.) Set up
the model for outflow. Indicate what portion of your
work in Example 5 you can use (so that it can become
part of the general method independent of the shape of
the tank). Find the time t to empty the tank (a) for any
R, (b) for R = I m. Plot t as function of R. Find the
time when h = R/2 (a) for any R, (b) for R = 1 m.
y = e·-2 L"e-
34. WRITING PROJECT. Exponential Increase, Decay,
Approach. Collect. order, and present all the information
on the ODE y' = ky and its applications fi'om the text
and the problems. Add examples of your own.
35. CAS EXPERIMENT. Graphing Solutions. A CAS
can usually graph solutions even if they are given by
integrals that cannot be evaluated by the usual methods
of calculus. Show this as follows.
1.4
y'
Exact ODEs. Integrating Factors
We remember from calculus that if a function u(x, y) has continuous partial derivatives,
its differential (also called its total differential) is
du
=
au
-
ax
dx
au
+-
dv.
ay'
From this it follows that if lI(X, y) = C = conST, then du = O.
For example, if u = x + x2.\'3 = c, then
or
y
,
dy
dx
an ODE that we can solve by going backward. This idea leads to a powerful solution
method as follows.
A first-order ODE Mex, y) + N(x, y)y' = 0, written as (use dy = y' d-,;; as in Sec. 1.3)
(1)
M(x, y) dx
+ N(x, y) dy = 0
20
CHAP. 1
First-Order ODEs
is called an exact differential equation if the differential form M(x, y) dx
is exact, that is, this form is the differential
au
au
ax
ay
+ N(x, y) dy
du= -dx+ -dy
(2)
of some function u(x, y). Then (1) can be written
du
=
o.
By integration we immediately obtain the general solution of (I) in the form
(3)
u(x, y)
=
c.
This is called an implicit solution, in contrast with a solution y = hex) as defined in Sec.
1.1, which is also called an explicit solution, for distinction. Sometimes an implicit solution
can be converted to explicit form. (Do this for x 2 + )'2 = 1.) If this is not possible, your
CAS may graph a figure of the contour lines (3) of the function u(x, y) and help you in
understanding the solution.
Comparing (I) and (2), we see that (1) is an exact differential equation if there is some
function u(x, y) such that
(4)
(a)
au
ax
=M,
(b)
au
ay
=N.
From this we can derive a formula for checking whether (1) is exact or not, as follows.
Let M and N be continuous and have continuous first partial derivatives in a region in
the xy-plane whose boundary is a closed curve without self-intersections. Then by partial
differentiation of (4) (see App. 3.2 for notation),
aM
ay
a2 u
ayax '
aN
a2 u
ax
ax ay
By the assumption of continuity the two second partial derivatives are equal. Thus
(5)
aM
aN
ay
ax
This condition is not only necessary but also sufficient for (1) to be an exact differential
equation. (We shall prove this in Sec. 10.2 in another context. Some calculus books (e.g.,
Ref. [GRIll also contain a proof.)
If (I) . is exact, the function u(x, y) can be found by inspection or in the followino
0
systematic way. From (4a) we have by integration with respect to x
SEC. 1.4
21
Exact ODEs. Integrating Factors
(6)
J
= M dx + k(y);
u
in this integration, y is to be regarded as a constant, and k(y) plays the role of a "constant"
of integration. To detennine key), we derive aulay from (6), use (4b) to get dkldy, and
integrate dkldy to get k.
Formula (6) was obtained from (4a). Instead of (4a) we may equally well use (4b).
Then instead of (6) we first have by integration with respect to y
(6*)
u
J
= N dy + lex).
To determine lex), we derive aulax from (6*), use (4a) to get dlldx, and integrate. We
illustrate all this by the following typical examples.
E X AMP L ElAn Exact ODE
Solve
cos (x + v) dx + {3y2 + 2y + cos (x + y» dy = O.
(7)
Solution.
Step 1. Test/or exactness. Our equation is of the form (1) with
M = cos (x + y),
+ y).
N = 3y2 + 2y + cos (x
Thus
aM
-
iiy
=
aN
-.- =
dx
-sm(x
+ v),
-
-sin (x + y).
From this and (5) we see that (7) is exact.
Step 2. Implicit general solution. From (Ii) we obtain by integration
(8)
u
=
f
M dx
+ k(y)
=
f cos
(x
+ y) dx + k(y)
= sin (x +
y)
+
k(y).
To find k(y), we differentiate this fonTIula with respect to y and use fonnula (4b), obtaining
au
ay
= cos (x +
dk
y) + dy
Hence dk/dy = 3y2 + 2y. By integration. k
we obtain the answer
2
= N = 3y +
= l + y2 +
lI(X. y) = sin (x
+
y)
2y + cos (x
+
y).
c*. Inserting this result into (8) and observing 0),
+
y3
+
y2 = c.
Step 3. Checldng all implicit solution. We can check by differentiating the implicit solution u(x, y) = c implicitly
and see whether this leads to the given ODE (7):
all
(9)
This
dlt
complete~
all
= -a
dx + x
ay
the check.
dy
= cos (.l. + -y) dx +
(cos (x + ),) +
3),2
+ 21')
d),
•
= O.
•
CHAP. 1 First-Order ODEs
22
E X AMP L E 2
An Initial Value Problem
Solve the initial value problem
(co~
(10)
Solutioll.
y sinh x
+ I) dx - sin y cosh \" d,'
=
You may verify that the given ODE is exact. We find
II
=-
ylll = 2.
O.
II.
For a change. let us use (6*),
Jsiny cosh x dy + I(x) = cosy cosh x + I(x).
From this. all/ax = cos y sinh x + dlldx = M = cos y sinh x + I. Hence dlldx = 1. By integration,
I(x) = x + c*. This gives the general solution II(X. y) = cos y cosh x + x = c. From the initial condition.
cos 2 cosh I + I = 0.358 = c. Hence the answer is cos y cosh x + x = 0.358. Figure 14 ~how~ the particular
solutions for c = O. 0.358 (thicker curve). 1. 2. 3. Check that the answer satisfies the ODE. (Proceed as in
Example I.) Also check thar the initial condition is satisfied.
•
y
2.5
2.0
0
I II
1.5/
1.0
0.5
t
o
0.5
Fig. 14.
E X AMP L E 3
1.0
1.5
2.0
2.5
3.0
x
Particular solu.ions in Example 2
WARNING! Breakdown in the Case of Nonexactness
The equation -y dx + x dy = 0 is not exact because M = -y and N = x. so that in (5), aMIi)y = -I but
aN/ax = I. Let us show that in such a case the present method does not work. From (6),
1I
=
J
M dx
+ key)
=
-xy + key),
iJl/
hence
ay
=
-x +
dk
{(I'
Now, all/ay should equal N = x, by (4b). However, this is impossible because key) can depend only on y. Try
•
(6*): it will also fail. Sohe the equation by another method that we have discussed.
Reduction to Exact Form. Integrating Factors
The ODE in Example 3 is -y dt + x dy = O. It is not exact. However, if we mUltiply it
by 1Ix2 , we get an exact equation [check exactness by (5)!],
(II)
y
-ydx + xdy
2
= - 2" dx
x
x
1
dy
x
+-
Integration of (11) then gives the general solution )f.t
(y)
=d -
=
x
C
=
= O.
COllst.
SEC. 1.4
23
Exact ODEs. Integrating Factors
This example gives the idea. All we did was multiply a given nonexact equation, say,
(12)
+
P(x, y) dx
Q(x, y) dy = 0,
by a function F that, in general, will be a function of both x and y. The result was an equation
FP dx
(13)
that i1S exact, so we can 1Solve it
an integrating factor of (12).
E X AMP L E 4
a~
+ FQ dy =
0
just discussed. Such a function F(x, y) i1S then called
Integrating Factor
The imegrating factor in (] II is F = L/x 2 . Hence in this case the exact equation (13) is
-v dx +
2
x
FP dx + FQ d)' =
.t
d)'
.
=
d
(
V )
-'-
=
x
o.
y
-
Solution
x
These are straight lines y = ex through the origin.
It is remarkable that we can readily find other mtegrating factors for the equation -y dx
2
1/)'2, lI(xy). and lI(x + )'2), because
(14)
-y,h + xdy _
(~)
2
- d
.
y
y
-y,h+xdy
----'----'- =
x)'
-d (Inx- ) .
y
-ydx+xdy
2
x
+Y
2
=
c.
+ x dy
= O. namely,
Y) .
(
=d arctanx
•
How to Find Integrating Factors
In simpler cases we may find integrating factors by inspection or perhaps after some trials,
keeping (14) in mind. In the general case, the idea is the following.
For M dx + N dy = 0 the exactness condition (4) is aM/a)' = aN/ax. Hence for (13),
FP dx + FQ d)' = 0, the exactness condition is
a
(15)
- . (FP)
ay
=
a
(FQ).
-.
ax
By the product rule, with subscripts denoting partial derivatives, this gives
In the general case, this would be complicated and useless. So we follow the Golden Rule:
cannot solve your problem, try to solve a simpler one-the result may be useful
(and may also help you later on). Hence we look for an integrating factor depending only
on one variable; fortunately, in many practical cases, there are such factors, as we shall
see. Thus, let F = F(x). Then Fy = 0, and Fx = F' = dFld'(, so that (15) becomes
If you
FP y = F'Q
+
FQx.
Dividing by FQ and reshuffling terms, we have
(16)
1 dF
F d,
--=R
'
This proves the following theorem.
where
R
=
1
Q
(ap _ aQ ) .
ay
ax
CHAP. 1
24
THEOREM 1
First-Order ODEs
Integrating Factor F(x)
If (12)
is sllch that the I ight side R of (16), depe1lds o1lly on x. then (12) has an
integrating factor F = F(x), which is obtained by i1lfegrating (16) ([nd taking
exponents on both sides,
(17)
F(x) = exp
I
R(x) dx.
Similarly, if F* = F*(y), then i.nstead of (16) we get
1 dF*
- - =R*
F* dy
,
(18)
where
R*
ap)
I
= p ( ~Q _
iJx
av
and we have the companion
THEOREM 2
Integrating Factor F*(y)
If (12)
is such that the right side R* of (I8) depends only 011 y, then (12) has an
integrating factor F* = F*(y), which is obtained from (18) in the fonn
F*(y)
(19)
E X AMP L E 5
I
= exp R*(y) dy.
Application of Theorems 1 and 2. Initial Value Problem
Using Theorem 1 or 2, find an integrating factor and solve the initial value problem
(e x +y + "eY) dx + (xe Y - 1) dy = 0,
(20)
Solutioll.
yeO) = -1
Step 1. NOllexactlless. The exactnes, check fails:
-oP = -0
oy
ay
(e
x+
Y
+
Y
ye )
= eX+Y + eY + yeY
but
aQ
0
- =ax
ax
(xe Y - I)
= eY .
Step 2. Integratillg factor. General solutio1l. Theorem 1 fails because R [the right side of (16)] depends on
both x and y,
R = -I
Q
(oP
-
oQ) =
-
iJy
1
- - - (e
-
xeY -
ax
I
x+ Y + e Y + yeY
- eY ).
Try Theorem 2. The right side of (18) b
R* - -1
-
P
(a- Q
ax
ap) -
- -
ay
-
-X, - -I - - - (e Y - eX +Y - eY - veY )
e + Y + yeY
•
=
-I.
Hence (19) give, the integrating factor F*(y) = e -Yo From this result and (20) you get the exact equation
(eX
+
y) £Ix
+
(x - e -Y) dy = O.
Test for exactness; you will get I on both sides of the exactne" condition. By imegration, using (4a),
u =
f
(ex
+ y)
£Ix = eX
+ xy +
key).
SEC. 1.4
25
Exact ODEs. Integrating Factors
Differentiate this with respect to), and use (4b) to get
all
-
ily
=
dk
dk
-dy =
x + = N = x - e- Y ,
dy
-e-Y
+ c*.
k = e- Y
,
Hence the general solution is
u(x.
y)
= eX + xy + e-Y = C.
Step 3. Particular solution. The initial condition .1'(0) = I gives lI(O. - I) = 1 + 0 + e = 3.72. Hence the
answer is eX + xy + e -Y = I + e = 3.72. Figure 15 shows several particular solutions obtained as level curves
of u(x, y) = c, obtained by a CAS, a convenient way in cases in which it is impossible or difficult to cast a
solution into explicit form. Note the curve that (nearly) ~atisfies the initial condition.
Step 4. Checking. Check by substitution that the answer satisfies the given equation as well as the initial
condition.
•
y
Fig. 15.
Particular solutions in Example 5
===== --.•........---.--....
-......... - ........ . . ----.~
~--
....:
11-20
I
EXACT ODEs. INTEGRATING FACTORS
Test for exactness. If exact, solve. If not, use an integrating
factor as given or find it by inspection or from the theorems
in the text. Also, if an initial condition is given, determine
the corresponding particular solu[ion.
1. x
3.
3
dx
-71'
4. (e Y
+
sin
-
5. 9x dx
)'3
71'X
dy = 0
sinh)' dx
ye X ) dx
+ 4y dy
+ (xe Y
=
2. (x - y)(dx - dy)
+
-
cos
71'X
=
0
11. - y dx + x dy
=
12. (e x + y
+ (xe x + y +
-
13. -3\" dx
y) dx
+
2x dy = O.
17. (cos
0
wX
+
+
9.
10. - 2xy sin
+
(2y
+
1Ix - xly2) dy
=
0
cos 2x) dx + (lIx - 2 sin 2y) dy
=
0
ylx2) dx
2
(X )
dx + cos (x 2 ) dy = 0
y(l) =
'iT
+
(1
+
1) dy
20. (sin y cos y
+ 11)' (-ylx 2 + 2
x dy) = 0,
0
e-X(-e- Y
7. e- 28 dr - 2re- 28 d()
8. (2x
+
sin
wx)
18. (cos xy + xly) dx
19. e- Y dx
0
F(x. y) = ylx 4
+ eX dy = O. y(Q) =
+ (xly) cos xy) dy =
w
6. eX(cos y dx - sin y dy) = 0
=
1) dy = 0
14. (x 4 + )'2) dx - xv dy = 0, ),(2) = I
15. e 2X (2 cos y dx - sin y dy) = 0, y(O)
16. -sinxy (y dx
cosh y dy = 0
eX) dy = 0
0
+
X
dx
=
0,
1
0
F = e X+ Y
cos 2 y) dx + x dy = 0
21. Under what conditions for the constants A, B. C, D is
(Ax + By) dx + (ex + Dy) dy = 0 exact? Solve
the exact equation.
26
CHAP. 1
First-Order ODEs
(d) [n another graph show the solution curves
satisfying y(O) = ::':::1. ::':::2, ::':::3. ::':::4. Compare the
quality of (c) and (d) and comment.
22. CAS PROJECT. Graphing Particular Solutions
Graph paI1icuiar solutions of the following ODE.
proceeding as explained.
Y cos x dx
(21)
I
+ -
y
(a) Test for exactness. If nece~sary, find an integrating
factor. Find the general solution II(X. y) = c.
(b) Solve (21) by separating variables. Is this simpler
than (a)?
(c) Graph contours II(X, y) = c by your CAS. (Cf. Fig.
16.)
Fig. 16.
1.5
(e) Do the same steps for another nonexact ODE of
your choice.
dy = 0
Particular solutions in CAS Project 22
23. WRITING PROJECT. Working Backward. Start
from solutions u(x, v) = c of your choice. find a
corresponding exact ODE, destroy exactness by a
multiplication or division. This should give you a feel
for the form of ODEs you can reach by the method of
integrating factors. (Working backward is useful in
other areas, too: Euler and other great masters
frequently did it.l
24. TEAM PROJECT. Solution by Several Methods.
Show this as indicated. Compare the amount of work.
(A) eY(sinh x dx + cosh x dy) = 0 as an exact ODE
and by separation.
(B) (I + 2x) cos y dx + dy!cos y = 0 by Theorem
2 and by separation.
(C) (x 2 + y2) eLI: - 2xy dy = 0 by Theorem I or 2
and by separation with v = ylx.
(D) 3x 2 .v dx + 4x 3 dy = 0 by Theorems I and 2
and by separation.
(E) Search the text and the problems for further ODEs
that can be solved by more than one of the methods
discussed so far. Make a list of these ODEs. Find
further cases of your own.
Linear ODEs. Bernoulli Equation.
Population Dynamics
Linear ODEs or ODEs that can be transformed to linear form are models of various
phenomena, for instance, in physics, biology, population dynamics, and ecology, as we
shall see. A first-order ODE is said to be linear if it can be written
(1)
y'
+ p(x)y =
rex).
The defining feature of this equation is that it is linear in both the unknown function y
and its derivative y' = dyJdJC, whereas p and r may be any given functions of x. If in an
application the independent variable is time, we write t instead of x.
If the first term is f(x)y' (instead ofy'), divide the equation by j(x) to get the "standard
form" (I), with y' as the first term. which is practical.
For instance. y' cos x + y sin x = x is a linear ODE, and its standard form is
y' + Y tan x = x secx.
The function rex) on the right may be a force, and the solution y(x) a displacement in
a motion or an electrical cunent or some other physical quantity. In engineering, rex) is
frequently called the input, and y(x) is called the output or the response to the input (and,
if given, to the initial condition).
SEC. 1.5
Linear ODEs.
27
Bernoulli Equation. Population Dynamics
Homogeneous Linear ODE. We want to solve (1) in some interval a < x < b, call it
J, and we begin with the simpler special case that rex) is zero for all x in J. (This is
sometimes written rex) ~ 0.) Then the ODE (l) becomes
)"' + p(x)y = 0
(2)
and is called homogeneous. By separating variables and integrating we then obtain
dy
-
y
=
In 1)'1 = -
thus
-p(x) dx,
f
p(x) dx
+
c*.
Taking exponents on both sides, we obtain the general solution of the homogeneous
ODE (2),
.v(x)
(3)
here we may also choose c
interval.
=
=
ce-Ip(X)
(c = ±ec *
dx
0 and obtain the trivial solution y(x)
=
when
.v ~ 0);
0 for all x in that
Nonhomogeneous Linear ODE. We now solve (I) in the case that rex) in (I) is not
everywhere zero in the interval J considered. Then the ODE (1) is called nonhomogeneous.
It turns out that in this case, (1) has a pleasant property; namely, it has an integrating
factor depending only on x. We can find this factor F(x) by Theorem I in the last section.
For this purpose we write (1) as
+
(py - r) dx
dy
= O.
This is P dx + Q dy = 0, where P = py - rand Q = 1. Hence the right side of (16) in
Sec. 1.4 is simply l(p - 0) = p, so that (16) becomes
1 dF
-
F
-
= p(:r).
dx
Separation and integration gives
dF
=pd'l
and
F
Taking exponents on both sides, we obtain the desired integrating factor F(x),
F(x) = eIP d:x'.
We now multiply (1) on both sides by this F. Then by the product rule,
eIp d:l'(y'
+
py)
=
(eIp d:t: y )'
=
eIp d:t: r .
By integrating the second and third of these three expressions with respect to x we get
e Ip
Dividing this equation by e Ip
(4)
dx
dXy
=
feIP
dX r
dx
+ c.
and denoting the exponent fp dx by h, we obtain
II = fp(x) dx.
28
CHAP. 1
First-Order ODEs
(The constant of integration in h does not matter; see Prob. 2.) Formula (4) is the general
solution of (l) in the form of an integral. Solving (l) is now reduced to the evaluation
of an integral. In cases in which this cannot be done by the usual methods of calculus,
one may have to use a numeric method for integrals (Sec. 19.5) or for the ODE itself
(Sec. 21.1).
The structure of (4) is interesting. The only quantity depending on a given initial
condition is c. Accordingly, writing (4) as a sum of two terms,
y(x) = e- h fehr dx
(4*)
+ ce- h,
we see the following:
(5)
E X AMP LEI
Total Output
= Response to the Input r + Response
to the Initial Data.
First-Order ODE, General Solution
Solve the linear ODE
y' _ )' =
Solutioll.
e 2x
Here,
p = -1.
h=fpdx=-x
and from (4) we obtain the general solution
From (4*) and t5) we see that the response to the input is e 2x .
In simpler cases, such as the present. we may not need the general formula (4). but may wish to proceed
directly. multiplying the given equation by e h = e -x. This gives
Integrating on both sides. we obtain the same result a, before:
ye- X = eX
E X AMP L E 2
+ c.
•
hence
First-Order ODE, Initial Value Problem
Solve the initial value problem
y' + Y tan x
Solutioll.
=
.1'(0) = I.
sin 2x,
Here p = tan x, r = sin 2x = 2 sin x cos x, and
fp dx
= ftan x dx = In Isec xl·
From this we see that in (4),
e h = sec x,
e -h
cos x,
=
ehr = (sec x)(2 sin x cos x) = 2 sin x,
and the general solutIOn of our equation is
y(x) = cos x
(2f sin x
d.1
+
c)
= c
cos x -
2cos x.
2
~rom this and the initial condition, I = c . I - 2· 12; thus c = 3 and the solution of Our initial value problem
2
y = 3 cos x - 2 cos x. Here 3 cos x is the response to the initial data, and -2 cos 2 x is the response to the
input sin 2x.
•
IS
SEC. 1.5
Linear ODEs.
E X AMP L E 3
29
Bernoulli Equation. Population Dynamics
Hormone Level
Assume that rhe level of a certain hormone in the blood of a patient varie, with time. Suppose that rhe time rate
of change is the difference between a sinusoidal input of a 24-hour period from the thyroid gland and a continuous
removal rate proportional to the level present. Set up a model for the hormone level in the blood and find its
general solution. Find the particular solution satisfying a suitable initial condition.
Solutioll.
Step 1. Setting lip a model. Let yet) be the hormone level at time t. Then the removal rate is Ky(t).
The input rate is A + B cos (2 7ft124). where A is the average input rate. and A ~ B to make the input nonnegative.
(The constants A. B. and K can be determined by measurements.) Hence the model is
y' (t)
=
+ B cos (127ft)
[n - Out = A
y' +
or
- Ky(t)
Ky = A + B cos (127ft).
The initial condition for a particular solution Ypart is Ypart(O) = Yo with t = 0 suitably chosen. e.g .. 6:00
Step 2. General solutio". In (4) we have p = K = COllst, h
the general solution
y(t) =
B
A
K
+ 144K2 +
e-
=
Kt. and r
=
~ ( 144K cos
+ B cos (127ft). Hence (4) gives
A
KtJeKt(A + B cos 12
7ft) dt + ce-
A.M.
Kt
7ft
7ft)
Kt
12
+ 127f sin 12 + ce- .
The last term decreases to 0 as t increases, practically after a short time and regardles~ of c (that is. of the initial
condition). The orher part of y(t) is called the stead~'-state solution because it consists of constant and periodic
terms. The entire solution is called the transient-state solution because it models the transition from rest to rhe
steady state. These terms are used quite generally for physical and other systems whose behavior depends on time.
Step 3. Particular soilition. Setting r = 0 in )(t) and choosing Yo = 0, we have
A
),(0) = -K +
B
2
? ·144K + c = 0,
144K + '"
thus
c=
A
K
B
2
2 ·144K.
144K + '"
Inserting this result into y(t). we obtain the particular solution
A
Vpart(t) = -K
+
B
2
144K
2
+ '"
(
7ft
144K cos -2
I
7ft) + 127f sin -2
I
I 44KB
)
A
( -K + 144K2 + 7f2 e
-Kt
with the steady-state part as before. To plot Yp",~ we must specify values for the constants, say, A = B = I and
K = 0.05. Figure 17 shows this solution. Notice that the transition period is relatively short (although K is small),
•
and the curve soon looks sinusoidal; this is the response to the input A + B cos (127ft) = I + cos (f27ft).
y
25
5
Fig. 17.
Particular solution in Example 3
CHAP. 1
30
First-Order ODEs
Reduction to Linear Form. Bernoulli Equation
Numerous applications can be modeled by ODEs that are nonlinear but can be transformed
to linear ODEs. One of the most useful ones of these is the Bernoulli equation5
y'
(6)
If a
+ p(x)y =
g(x)ya
(a any real number).
= 0 or a = I, Equation (6) is linear. Otherwise it is nonlinear. Then we set
U(x)
=
[y(X)]l-a.
We differentiate this and substitute y' from (6). obtaining
Simplification gives
u'
where yl-a
=
=
(l - al(g - pyl-a),
II on the right, so that we get the linear ODE
(7)
tI'
+ (l -
a)pu
=
(1 - a)g.
For further ODEs reducible to linear from, see Ince's classic [A 111 listed in App. I.
See also Team Project 44 in Problem Set 1.5.
E X AMP L E 4
Logistic Equation
Solve thc following Bemoulli equation. known as the logistic equation (or Verhulst equation6 ):
y'
(8)
Solutioll.
to see thaI
{l
Write (8) in the form
=
2, so that
It
=
(6).
/-a =
=
Ay - By2
that is.
y -1. Ditferentiate this
II
and substitute y' from (8).
The last term is _Ay-l = -All. Hence we have obtained the linear ODE
5JAKOB BERNOULLI (1654-1705), Swiss mathematician, professor at Basel. also known for his contribution
to elasticity theory and mathematical probability. TIle method for solving Bernoulli's equation was discovered by
the Leibniz in 1696. Jakob Bernoulli's students included his nephew NJKLACS BERNOULLI (1687-1759). who
contributed to probability theory and infinite series. and his youngest brother JOHANN BERNOULLI (1667-1748).
who had profound influence on the development of calculus. became Jakob's successor at Basel. and had among
his students GABRIEL CRAMER (see Sec. 7.7) and LEONHARD EULER (see Sec. 2.5). His son DANIEL
BERNOULLI (I700-1782) is "nown for his basic work in fluid flow and the kinetic theory of gases.
6PIERRE-FRAN<;:OJS VERHLLST, Belgian statistician, who introduced Eq. (8) a:, a model for human
population growth in 1838.
SEC. 1.5
Linear ODEs.
31
Bernoulli Equation. Population Dynamics
1/'
+ All
11=
ce- At
=
B.
The general solution is [by (4)1
Since
II
= lIy,
this gives the general solution of (8).
(9)
Directly from
+ BfA.
(8)
we see that
y
0=
0
y
= -;; =
(y( t) =
0 for all
t)
is also a solution.
o
Fig. 18.
(Fig. 18).
ce-At + BfA
•
Timet
Logistic population model. Curves (9) in Example 4 with A/B = 4
Population Dynamics
The logistic equation (8) plays an important role in population dynamics, a field that
models the evolution of populations of plants. animals, or humans over time t. If B = 0,
then (8) is y' = dyldt = Ay. In this case its solution (9) is y = (l/c)e At and gives exponential
growth, as for a small population in a large counn)' (the United States in early times!).
This is called Malthus's law. (See also Example 3 in Sec. 1.1.)
The term -By2 in (8) is a "braking term" that prevents the population from growing
without bound. Indeed, if we write y' = Ay[l - (BIA)y]. we see that if y < AlB, then
y' > O. so that an initially small population keeps growing as long as y < AlB. But if
y > AlB. then y' < 0 and the population is decrea-;ing as long as y > AlB. The limit is
the same in both cases, namely, AlB. See Fig. 18.
We see that in the logistic equation (8) the independent variable t does not occur
explicitly. An ODE y' = f(t, y) in which t does not occur explicitly is of the form
(10)
y' == f(y)
and is called an autonomous ODE. Thus the logistic equation (8) is autonomous.
Equation (10) has constant solutions, called equilibrium solutions or equilibrium
points. These are determined by the zeros of fey), because fey) = 0 gives y' = 0 by (10);
hence y = const. These zeros are known as critical points of (10). An equilibIium
solution is called stable if solutions close to it for some t remain close to it for all further
t. It is called unstable if solutions initially close to it do not remain close to it as t
increases. For instance, \" = 0 in Fig. 18 is an unstable equilibrium solution, and \' = 4
is a stable one.
.
32
CHAP. 1
E X AMP L E 5
First-Order ODEs
Stable and Unstable Equilibrium Solutions. "Phase Line Plot"
The ODE y' = (y - l)(y - 2) has the stable equilibrium solution Yl = I and the unstable Y2 = 2, as the
direction field in Fig. 19 suggests. The values )'1 and Y2 are the zeros of the parabola fey) = (y - l)ly - 2)
in the figure. Now, since the ODE is autonomous, we can "condense" the direction field to a "phase line plot"
giving)'1 and Y2, and the direction (upward or downward) of the arrows in the field, and thus giving information
about the stability or instability of the equilibrium solutions.
•
y
y(xl
///////
~O.
/ / / / / / /
2.5' "/ / / / / / / /
/////////
1111111111
'111111111
1I1I1I1,i
11111111'
/ / / / / / / / / '"
~~~~~~~~/
./////////
-~~~~~~~~~
Y2===-=-~--::::--=-W .==-~--=.-:.-:::.
---------~---------------~r------------~-- -----------------------------------------Yl ------_
±oe ._-------
---------
-
~~~~~~~~/
~~~~~~~~~
/ / / / / / / / ~/
/ / / / / / /
"'/////////
'0:5-
/ / / / / / / / /
0
2
I: r ,.
111111111 ',111111111
I f f f f 1 I I 1 ~ ,. ,. ,. ,. t I ,.
-2
-1
(Al
x
x
(el
(Bl
Fig. 19.
Example 5. (A) Direction field. (B) "Phase line". (C) Parabola f(y)
A few further population models will be discussed in the problem set. For some more
details of population dynamics, see C. W. Clark, Mathematical Bioecnnvmics, New York,
Wiley, 1976.
Further important applications of linear ODEs follow in the next section.
1. (CAUTION!) Show that
e-1n(sec x) = cos x.
e-
1n
x =
l/x
(not -x) and
6. x 2 y'
7. y'
2. (Integration constant) Give a reason why in (4) you
may choose the constant of integration in fp dx to be
zero.
+
-t-
3xy = lIx,
ky = e
8. y' + 2y = 4 cos 2x,
9. y'
y(l) = -1
2kx
y(!7T) = 2
6(y - 2.5) tanh 1.5x
+ 4x 2 y = (4x 2 - x)e- x2/ 2
11. )" + 2y sin 2x = 2e cos 2X, yeO)
10. y'
13-171
GENERAL SOLUTION. INITIAL VALUE
PROBLEMS
Find the general solution. If an initial condition is given,
find also the corresponding particular solution and graph or
sketch it. (Show the details of your work.)
3. y' + 3.5y
+
=
2.8
4. y'
=
4y
5. )"
+
1.25y = 5,
x
yeO)
6.6
12. )" tan x = 2y - 8,
0
Y(~7T) = 0
+ 4y cot 2x = 6 cos 2x, y(!7T) = 2
14. y' + )' tan x = e- O. Ob cos x, 1'(0) = 0
15. y' + Y/X2 = 2xe 1/ x , y(l) = 13.86
16. y' cos 2 x + 3y = I, y(!7T) = ~
17. x 3 y' + 3x 2 y = 5 sinh lOx
13. y'
SEC. 1.5
Linear ODEs.
Bernoulli Equation. Population Dynamics
118-241 NONLINEAR ODEs
Using a method of this section or separating variables, find
the general solution. If an initial condition is given, find
also the particular solution and sketch or graph it.
18. y' + Y = y2, yeO) = -I
y' +
= -tan
y.
y(O)
= ~7T
+ I)y = e 'y3, y(O) = 0.5
22. y' sin 2y + x cos 2y = 2x
23. 2yy' + .\'2 sin x = sin x. yeO) = V2
24. y' + x 2y = (e- sinh X)/(3y2)
21.
X
(x
X3
125-361
FURTHER APPLICATIONS
25. (Investment programs) Bill opens a retirement
savings account with an initial amount Yo and then adds
$k to the account at the beginning of every year until
retirement at age 65. Assume that the interest is
compounded continuously at the same rate R over the
years. Set up a model for the balance in the account
and find the general solution as well as the particular
solution, letting t = 0 be the instant when the account
is opened. How much money will Bill have in (he
account at age 65 if he starts at 25 and invests $1000
initially as well as annually, and the interest rate R is
6%? How much should he invest initially and annually
(same amounts) to obtain the same final balance as
before if he starts at age 45? First, guess.
26. (Mixing problem) A tank (as in Fig. 9 in Sec. 1.3)
contains 1000 gal of water in which 200 Ib of salt is
dissolved. 50 gal of brine. each uallon containin u
(I + cos t) Ib of dissolved salt, run: into the tank pe~
minute. The mixture. kept unifonn by stirring. nms out
at the same rate. Find the amount of salt in the tank at
any time t (Fig. 20).
1000
500
200
Fig. 20.
28. (Heating and cooling of a building) Heating and
cooling of a building can be modeled by the ODE
=
k 1 (T - Ta)
+ k 2 (T -
Tw)
+ P,
where T = T(t) is the temperature in the building at
time t, Ta the outside temperature, T w the temperature
wanted in the building. and P the rate of increase of T
due to machines and people in the building, and kl and
~ are (negative) constants. Solve this ODE, assuming
P = canst, T w = canst, and To varying sinusoidally
over 24 hours, say, Ta = A - C cos (27T/2A)t. Discuss
the effect of each tenn of the equation on the solution.
29. (Drug injection) Find and solve the model for drug
injection into the bloodstream if, beginning at t = 0, a
constant amount A g/min is injected and the drug is
simultaneously removed at a rate proportional to the
amount of the drug present at time t.
30. (Epidemics) A model for the spread of contagious
diseases is obtained by assuming that the rate of spread
is proportional to the number of contacts between
infected and noninfected persons, who are assumed to
move freely among each other. Set up the model. Find
the equilibrium solutions and indicate their stability or
instability. Solve the ODE. Find the limit of the
proportion of infected persons as t -+ x and explain
what it means.
31. (Extinction vs. unlimited growth) If in a population
y(t) the death rate is proportional to the population, and
the birth rate is proportional to the chance encounters
of meeting mates for reproduction. what will the model
be? Without solving. find out what will eventually
happen to a small initial population. To a large one.
Then solve the model.
32. (Harvesting renewable resources. Fishing) Suppose
that the population yU) of a certain kind of fish is given
by the logistic equation (8), and fish are caught at a
rate Hy proportional to y. Solve this so-called Schaefer
model. Find the equilibrium solutions Yl and Y2 (> 0)
when H < A. The expression Y = HY2 is called the
equilibrium harvest or sustainable yield corresponding
to H. Why?
y
o
concentration p/4. and the mixture is unifonn (an
assumption that is only very imperfectly true)? First,
guess.
T'
19. y' = 5.7y - 6.5 y 2
20. (x 2 + I)y'
33
50
100
Amount of salt y(t) in the tank in Problem 26
27. (Lake Erie) Lake Erie has a water volume of about
450 km3 and a flow rate (in and out) of about 175 km 3
per year. If at some instant the lake has pollution
c~nc~ntration p = 0.04%, how long, approximately.
wIll It take to decrease it to pl2. assuming that the
inflow is much cleaner, say, it has pollution
33. (Harvesting) In Prob. 32 find and graph the solution
satisfying yeO) = 2 when (for simplicity) A = B = I
and H = 0.2. What is the limit? What does it mean?
What if there were no fishing?
34. (Intermittent harvesting) In Prob. 32 assume that you
fish for 3 years, then fishing is banned for the next 3
years. Thereafter you start again. And so on. This is
called illtermittent /Uln'estillg. Describe qualitatively
how the population will develop if intermitting is
CHAP. 1 First-Order ODEs
34
continued periodically. Find and graph the solution for
the first 9 years, assuming that A = B = I, H = 0.2,
and yeO) = 2.
y
2
1.8
1.6
43. CAS EXPERIMENT. (a) Solve the ODE
y' - ylx = -x- 1 cos (l/x). Find an initial condition
for which the arbitrary constanl is zero. Graph the
resulting particular solution, experimenting to obtain
a good figure near x = O.
(b) Generalizing (a) from 11 = I to arbitrary 11, solve
the ODE y' - nylx = _xn - 2 cos (llx). Find an initial
condition as in (a). and experiment with the graph.
44. TEAM PROJECT. Riccati Equation, Clairaut
Equation. A Riccati equation is of the form
1.4
(1 I)
1.2
y' +
p(x)y = gp:)y 2
+
hex).
A Clairaut equation is of the form
0.8 L - _ - - L_ _" - - _ - - ' -_ _->--_
4
o
2
6
8
8. 21.
Fish population in Problem 34
35. (Harvesting) If a population of mice (in multiples of
1000) follows the logistic law with A = I and B = 0.25.
and if owls catch at a time rate of 10% of the population
present, what is the model, its equilibrium harvest for
that catch. and its solution?
36. (Harvesting) Do you save work in Prob. 34 if you first
transform the ODE to a linear ODE? Do this
transformation. Solve the resulting ODE. Does the
resulting yet) agree with that in Prob. 34?
I
- 7-40
GENERAL PROPERTIES OF LINEAR ODEs
These properties are of practical and theoretical importance
because they enable us to obtain new solutions from given
ones. Thus in modeling, whenever possible, we prefer linear
ODEs over nonlinear ones, which have no similar
properties.
Show that nonhomogeneous linear ODEs (1) and
homogeneous linear ODEs (2) have the following
properties. Illustrate each property by a calculation for two
or three equations of your choice. Give proofs.
37. The sum YI + Y2 of two solutions YI and Y2 of the
homogeneous equation (2) is a solution of (2), and so
is a scalar mUltiple aYI for any constant a. These
properties are not true for (1)1
38. Y = 0 (that is, .v(x) = 0 for all x, also written y(x) "'" 0)
is a solution of (2) [not of (I) if rex) =1= 01], called the
trivial solution.
39. The sum of a solution of (I) and a solution of (2) is a
solution of (1).
40. The difference of two solutions of (l) is a solution of (2).
41. If Yl is a sulution of (I), what can you say about eYl?
42. If YI and Y2 are solutions of y~ + PYI = rl and
Y~ + PY2 = r2, respectively (with the same p!), what
can you say about the sum.vl + Y2?
(12)
y
=
xy' +
g(y').
(a) Apply the transformation y = Y + lilt to the
Riccati equation (1 I ), where Y is a solution of (11), and
oblain for u the linear ODE u' + (2Yg - P)U = -g.
Explain the effect of the transformation by writing it
as y = Y + v, v = lilt.
(b) Show that Y = Y
y' - (2x 3 +
l)y
=
x is a solution of
= _x 2 y2
-
X4 -
X
+
and solve this Riccati equation. showing the details.
(c) Solve y' + (3 - 2X2 sin x)y
= _y2 sin x + 2x + 3x 2 - X4 sin x, using (and
Verifying) that y = x 2 is a solution.
(d) By working "backward" from the L1-equation find
further Riccati equations that have relatively simple
solutions.
(e) Solve the Clairautequationy = xy' + 1/y'.Hillt.
Differentiate this ODE with respect to x.
(f) Solve the Clairaut equation /2 - xy' + Y = 0
in Prob. 16 of Problem Set 1.1.
(g) Show that the Clairaut equation (12) has as
solutions a family of straight lines Y = ex + gee) and
a singular solution determined by g' (s) = -x, where
s = y', that forms the envelope of that family.
45. (Variation of parameter) Another method of
obtaining (4) results from the following idea. Write
(3) as ey*, where y* is the exponential function.
which is a solution of the homogeneous linear ODE
y*' + py* = O. Replace the arbitrary constant e in (3)
with a function II to be determined so that the resulting
function y = IIY* is a solution of the nonhomogeneous
linear ODE y' + PY = r.
46. TEAM PROJECT. Transformations of ODEs. We
have transformed ODEs to separable form, to exact
form, and to linear form. The purpose of such
transfonnations is an extension of solution methods to
larger classes of ODEs. Describe the key idea of each
of these transformations and give three typical
examples of your choice for each transformation,
shOwing each step (not just the transformed ODE).
35
Optional
SEC. 1.6
Orthogonal Trajectories.
1.6
Orthogonal Trajectories.
Optional
An important type of problem in physics or geometry is to find a family of curves that
intersect a given family of curves at right angles. The new curves are called orthogonal
trajectories of the given curves (and conversely). Examples are curves of equal
temperamre (isothel7lls) and curves of heat flow, curves of equal altitude (contour lines)
on a map and curves of steepest descent on that map, curves of equal potential
(equipotential curves, curves of equal voltage-the concentric circles in Fig. 22). and
curves of electric force (the straight radial segments in Fig. 22).
Fig. 22. Equipotential lines and curves of electric force (dashed)
between two concentric (black) circles (cylinders in space)
Here the angle of intersection between two curves is defined to be the angle between
the tangents of the curves at the intersection point. Orthogonal is another word for
pe rpendicular.
In many cases orthogonal trajectories can be found by using ODEs. as follows. Let
(1)
G(x, y, c) = 0
be a given family of curves in the xy-plane, where each curve is specified by some value
of c. This is called a one-parameter family of curves, and c is called the parameter
of the family. For instance, a one-parameter family of quadratic parabolas is given by
(Fig. 23)
or, written as in (1),
G(x, y, c)
= y - cx 2 =
o.
Step 1. Find an ODE for which the given family is a general solution. Of course, this
ODE must no longer contain the parameter c. In our example we solve algebraically for
c and then differentiate and simplify; thus,
hence
y
,
2y
x
CHAP. 1
36
First-Order ODEs
The last of these equations is the ODE of the given family of curves. It is of the form
y' = f(x, y).
(2)
Step 2. Write down the ODE of the orthogonal trajectories. that is. the ODE whose general
solution gives the orthogonal trajectOlies of the given curves. This ODE is
_,
1
y =--f(x, y)
(3)
with the same f as in (2). Why? Well, a given curve passing through a point (xo, Yo) has
slope f(xo, Yo) at that point, by (2). The trajectory through (xo, Yo) has slope -lIf(xo, Yo)
by (3). The product of these slopes is -1, as we see. From calculus it is known that this
is the condition for orthogonality (perpendicularity) of two straight lines (the tangents at
(xo, Yo», hence of the curve and its orthogonal trajectory at (xo, Yo).
Step 3. Solve (3).
For our parabolas y = cx 2 we have y' = 2ylx. Hence their orthogonal trajectories are
obtained from y' = - xl2y or 2yy' + x = O. By integration, y2 + !x 2 = c*. These are
the ellipses in Fig. 23 with semi-axes Vk* and VC*. Here, c* > 0 because c* = 0 gives
just the origin, and c * < 0 gives no real solution at all.
y
Fig. 23.
11-121
Parabolas and orthogonal trajectories (ellipses) in the text
ORTHOGONAL TRAJECTORIES
Sketch or graph some of the given curves. Guess what their
orthogonal trajectories may look like. Find these
trajectories.
(Show the details of your work.)
1. Y = 4x + c
2. y = clx
3. y = ex
4. y2 = 2X2 + c
2
5. x y =
C
6. y = ce- 3x
7. Y = ce x2/2
9. 4x 2 + y2 = e
11. x = ce yl4
113-15/
8. x 2
10.
x
12. x 2
-
y2
= c
cVy
=
+
(y - c)2 = c 2
OTHER FORMS OF THE ODEs (2) AND (3)
13. (y as independent variable) Show that (3) may be
written dx/dy = -f(x, y). Use this form to find the
orthogonal trajectories of y = 2x + ce-x .
SEC. 1.7
37
Existence and Uniqueness of Solutions
=
18. (Electric field) The lines of electric force of two
opposite charges of the same strength at (-1. 0) and
(1, 0) are the circles through (-1. 0) and (l, 0). Show
that these circles are given by x 2 + (y - c)2 = 1 -+ c 2.
Show that the equipotential lines (orthogonal
trajectories of those circles) are the circles given by
lx + C*)2 + )'2 = C*2 - I (dashed in Fig. 25).
14. (Family g(x,y)
c) Show that if a family is given as
g(x, y) = c, then the orthogonal trajectories can be
obtained from the following ODE, and use the latter to
solve Prob. 6 written in the form g(x, y) = c.
dy
ag/ay
dx
ag/iJx
15. (Cauchy-Riemann equations) Show that for a family
u(x, y) = c = const the orthogonal trajectories
vex, y) = c* = const can be obtained from the following
Cauchy-Riemann equations (which are basic in
complex analysis in Chap. 13) and use them to find the
orthogonal trajectories of eX sin y = const. (Here,
subscripts denote partial derivatives.)
116-20 1
APPLICATIONS
16. (Fluid flow) Suppose that the streamlines of the flow
lpaths of the particles of the fluid) in Fig. 24 are
'ltlx, y) = xy = COllst. Find their orthogonal trajectories
(called equipotential lines, for reasons given in Sec.
18.4).
Fig. 25.
Electric field in Problem 18
19. (Temperature field) Let the isotherms (curves of
constant temperature) in a body in the upper half-plane
y > 0 be given by 4X2 + 9)"2 = c. Find the orthogonal
trajectories lthe curves along which heat will flow in
regions filled with heat-conducting material and free
of heat sources or heat sinks).
20. TEAM PROJECT. Conic Sections. (A) State the
main steps of the present method of obtaining orthogonal
trajectorics.
(B) Find conditions under which the orthogonal
trajectories of families of ellipses x 2/a 2 + y2/b 2 = C are
again conic sections. Illustrate your result graphically
by sketches or by using your CAS. What happens if
a~ O?If b~ O?
x
Fig. 24.
Flow in a channel in Problem 16
17. (Electric field) Let the electric equipotential lines
(curves of constant potential) between two concentric
cylinders (Fig. 22) be given by u(x, y) = x 2 + y2 = c.
Use the method in the text to find their orthogonal
trajectories (the curves of electric force).
1.7
(C) Investigate families of hyperbolas
x 2/a 2 - y2/b 2 = c in a similar fashion.
(D) Can you find more complicated curves for which
you get ODEs that you can solve? Give it a try.
Existence and Uniqueness of Solutions
The initial value problem
/,1"/
+ /y/ =
0,
.1'(0)
= 1
has no solution because y = 0 (that is, y(x) = 0 for all x) is the only solution of the ODE.
The initial value problem
y'
== 2x,
.1'(0)
=
I
38
CHAP. 1
First-Order ODEs
has precisely one solution, namely, y = x 2
xy' = y - I,
+
1. The initial value problem
yeO)
=
I
has infinitely many solutions, namely, y = 1 + cx, where c is an arbitrary con<;tant because
= 1 for all c.
From these examples we see that an initial value problem
y(O)
(1)
y'
= f(x, y),
may have no solution, precisely one solution, or more than one solution. This fact leads
to the following two fundamental questions.
Problem of Existence
Under what conditions does an initial mlue problem
nne solutinn (hence one or several solutions)?
~f the
form (I) have at least
Problem of Uniqueness
Under what conditiollS dnes that problem have at 17l0H one solution (hence excluding
the case that is has more than one solution)?
Theorems that state such conditions are called existence theorems and uniqueness
theorems, respectively.
Of course, for our simple examples we need no theorems because we can solve these
examples by inspection; however, for complicated ODEs such theorems may be of
considerable practical importance. Even when you are sure that your physical or other
system behaves uniquely, occasionally your model may be oversimplified and may not
give a faithful picture of the reality.
THEOREM 1
Existence Theorem
Let the right side f(x, y) of the ODE ill the initial value problem
y'
(1)
= f(x, y),
y(xo)
= Yo
be continuous at all points (x. y) in some rectangle
R:
Ix -
xol < a.
IY - Yol <
b
(Fig. 26)
alld bounded ill R; that is, there is a number K sllch that
(2)
If(x, y)1 ~ K
for all (x, y) ill R.
Then the initial value problem (1) has at least one solution y(x). This solution exists
at least for all x in the subinterl'al Ix - xol < a of the illfervallx - xol < a; here,
a is the smaller ~f the two numbers (/ and b/K.
SEC 1.7
39
Existence and Uniqueness of Solutions
Y
-----<j'
Yo
R
x
Fig. 26.
Rectangle R in the existence and uniqueness theorems
(Example of BOllndedlless. The function f(x, .1') = x 2 + y2 is bounded (with K = 2) in the
square Ixl < I, Iyl < 1. The function f(x, y) = tan (x + y) is not bounded for Ix + yl < 7T/2.
Explain!)
THEOREM 2
Uniqueness Theorem
Let f alld its partial derivative fy = rJf/rJy be collfinllOllS for aI/ (x, y) ill the
rectallgle R (Fig. 26) alld bOllnded, say,
(3)
(a)
If(x,
y)1
~ K,
for all (x, y) ill R.
Theil the illitial mille problem (1) has at most aile solutioll y(xJ. TIllis. by TIleorem 1,
the problem has precisely aile soilltioll. This solution exists at leastfor aI/ x in that
subinterval Ix - xol < a.
Understanding These Theorems
These two theorems take care of almost all practical cases. Theorem 1 says that if f(x, y)
is continuous in some region in the xy-plane containing the point (xo, Yo). then the initial
value problem (I) hali at least one solution.
Theorem 2 says that if, moreover, the partial derivative aflay of f with respect to y
exists and is continuous in that region, then (I) can have at most one solution; hence, by
Theorem I, it has precisely one solution.
Read again what you have just read-these are entirely new ideas in our discussion.
Proofs of these theorems are beyond the level of this book (see Ref. [A II] in App. I);
however, the following remarks and examples may help you to a good understanding of
the theorems.
~ K; that is, the slope of any
Since / = f(x, y), the condition (2) implies that
solution curve y(x) in R is at least - K and at most K. Hence a solution curve that passes
through the point (xo, Yo) must lie in the colored region in Fig. 27 on the next page bounded
by the lines 11 and 12 whose slopes are -K and K, respectively. Depending on the form
of R, two different cases may arise. In the first case, shown in Fig. 27a, we have blK ~
a and therefore a = a in the existence theorem, which then asserts that the solution exists
for all x between Xo - a and Xo + ll. In the second case, shown in Fig. 27b, we have
blK < ll. Therefore, ll' = blK < a, and all we can conclude from the theorems is that the
solution exists for all x between Xo - blK and Xo + blK. For larger or smaller x's the
solution curve may leave the rectangle R, and since nothing is assumed about f outside
R, nothing can be concluded about the solution for those larger or smaller x's: that is, for
<;uch x's the solution mayor may not exist-we don't know.
1/1
First-Order ODEs
CHAP. 1
40
y
Yo + b
Yo
yo-b
x
Xo
(b)
(a)
Fig. 27.
The condition (2) of the existence theorem. (a) First case. (b) Second case
Let us illustrate our discussion with a simple example. We shall see that our choice of
a rectangle R with a large base (a long x-interval) will lead to the ca<;e in Fig. 27b.
E X AMP L E 1
Choice of a Rectangle
Consider the initial valne problem
"(0) = 0
and take the rectangle R;
I.\i <
5,
1)'1 <
3. Then a
I I
ilf
-:(Iv
a =
=
5, b
=
=
21yl
~ M = 6.
b
K=
3. and
0.3 < a.
Indeed. the solntion of the problem is y = tan x (see Sec. 1.3, Example I). This solntion is discontinuous at
± 70/2, and there is no callTilllwl/s solution valid in (he entire imerval Ixl < 5 from which we starred.
•
The conditions in the two theorems are sufficient conditions rather than necessary ones, and
can be lessened. [n particular. by the mean value theorem of differential calculus we have
where (x, Yl) and (x, )"2) are assumed to be in R, and:V is a suitable value between)"l and
Y2' From this and (3b) it follows that
(4)
It can be shown that (3b) may be replaced by the weaker condition (4), which is known
as a Lipschitz condition.7 However, continuity of f(x, y) is not enough to guarantee the
uniquelless of the solution. This may be illustrated by the following example.
SEC. 1.7
41
Existence and Uniqueness of Solutions
E X AMP L E 2
Nonuniqueness
The initial value problem
y'
ViYj,
=
yeO)
=
0
has the two solutions
y=o
v*
and
•
=
{
x2/4
if x ~ 0
2
if x < 0
-x 14
although f(x, ,1') = Vlyl is continuous for all y. The Lipschitz condition (4) is violated in any region that includes
the line y = 0, because for h = 0 and positive Y2 we have
(5)
II(x,
)'2) -
I(x, Yl)1
IY2 - hi
I
(~>Ol
vy; .
and this can be made as large as we plea,e by choosing ,1'2 sufficiently ,mall. whereas (4) requires that the
quotient on the left side of (5) should not exceed a fixed constant M.
•
_ _._-.= _
--. . . . =..._
u .•.
.- ... -.............
1. (Vertical strip) If the a~sumptions of Theorems I and 2
are satisfied not merely in a rectangle but in a vertical
infinite strip Ix - xol < a, in what interval will the
solution of (1) exist?
2. (Existence?) Does the initial value problem
(x - l)y' = 2y, y(l) = I have a solution? Does your
result contradict our present theorems?
3. (Common points) Can two solution curves of the same
ODE have a common point in a rectangle in which the
assumptions of the present theorems are satisfied?
4. (Change of initial condition) What happens in Prob. 2
if you replace yO) = 1 with yO) = k?
S. (Linear ODE) If p and I' in y' + p(x)y = rex) are
continuous for all x in an interval Ix - xol ~ a, show
that f(x, y) in this ODE satisfies the conditions of our
present theorems, so that a cOlTesponding initial value
problem has a unique solution. Do you actually need
these theorems for this ODE?
6. (Three possible cases) Find all initial conditions such
that lx 2 - 4x)y' = (2x - 4)y has no solution, precisely
one solution, and more than one solution.
7. (Length of x-interval) In most cases the solution of an
initial value problem (1) exists in an x-interval larger
than that guaranteed by the present theorems. Show this
fact for y' = 2y2, yO) = I by finding the best possible
a (choosing b optimally) and comparing the result with
the actual solution.
8. PROJECT. Lipschitz Condition. (A) State the
definItion of a Lipschitz condition. Explain its relation
to the existence of a partial derivative. Explain its
significance in our present context. Illustrate your
statements by examples of your own.
(B) Show that for a lillear ODE y' + p(:.:)y = rex) with
continuous p and r in Ix - xol ~ a a Lipschitz condition
holds. This is remarkable because it means that for a
linear ODE the continuity of f(x, y) guarantees not only
the existence but also the uniqueness of the solution of
an initial value problem. (Of course, this also follows
directly from (4) in Sec. 1.5.)
(C) Discuss the uniqueness of solution for a few simple
ODEs that you can solve by one of the methods
considered, and find whether a Lipschitz condition is
satisfied.
9. (Maximum a) What
Example I in the text?
IS
the largest possible a
In
10. CAS PROJECT. Picard Iteration. (A) Show that by
integrating the ODE in (I) and observing the initial
condition you obtain
(6)
y(x) = Yo
+
f
f(t. \'(t» dt.
Xo
7RUDOLF LIPSCHITZ (1832-1903), Gennan mathematician. Lipschitz and similar conditions are important
in modern theories, for instance, in partial differential equations.
CHAP. 1 First-Order ODEs
42
(B) Apply the iteration to y' = x + y, yeO) = O. Also
solve the problem exactly.
(0 Apply the iteration to y' = 2y2. yeO) = 1. Also
solve the problem exactly.
This fonn (6) of (I) suggests Picard's iteration
methodS, which is defined by
(7)
y,,(x) = )'0
+
f'
fU. y,.-IU)) dt.
n = 1. 2.....
Xo
2VY.
=
y( I) = O. Which
of them does Picard's iteration approximate?
(D) Find all solutions of)"
It gives approximations )'1 • .'"2. )'3' . . . of the unknown
solution), of (I). Indeed, you obtain ."1 by substituting
y = )'0 on the right and integrating-this is the first
step-, then Y2 by substituting y = YI on the right and
integrating-this is the second step-. and so on. Write
a program of the iteration that gives a printout of the
first approximations Yo. )'1' ...• YN as well as their
graphs on common axes. Try your program on two
initial value problems of your own choice.
..
(E) Experiment with the conjecture that Picard's
iteration converges to the solution of the problem for
any initial choice of y in the integrand in (7) (leaving
)'0 outside the integral as it is). Begin with a simple
ODE and see what happens. When you are reasonably
sure. take a slightly more complicated ODE and give
it a try.
Ew=.Q U £5 T ION SAN D PRO B L EMS
,
14. y'
1. Explain the tenns ordinary d~fferellfial equatiol/ (ODE).
partial d~fferellfial equation (PDE). order. gel/eral
solution. and particular solutioll. Give examples. Why
are these concepts of imp0l1ance?
2. What is an initial condition? How is this condition used
in an initial value problem?
16xly
13. Y
GENERAL SOLUTION
, 15-261
Find the general solution. Indicate which method in this
chapter you are using. Show the details of your work.
15. y' = x 2 (1 + .1'2)
x 2 + I)
17. yy' + xy2 = x
IS. -7T sin TTX cosh 3y dx
3. What is a homogeneous linear ODE? A nonhomogeneous
linear ODE? Why are these equations simpler than
nonlinear ODEs?
16.
y'
4. What do you know about direction fields and their
practical importance?
5. Give examples of mechanical problems that lead to ODEs.
19.
y' +
6. Why do electric circuits lead
to
ODEs?
7. Make a list of the solution methods considered. Explain
each method with a few short sentences and illustrate
it by a typical example.
S. Can certain ODEs be solved by more than one method?
Give three examples.
9. What are integrating factors? Explain the idea. Give
examples.
10. Does every first-order ODE have a solution? A general
solution? What do you know about uniqueness of
solutions?
111-141 DIRECTION FIELDS
Graph a direction field (by a CAS or by hand) and sketch
some of the solution curves. Solve the ODE exactly and
compare.
11. ,.' = 1 + 4)'2
12. y'
3y - 2~
= x(y -
+
21. 3 sin 2y dx
22.
x/
3 cos TTX sinh 3y dy = 0
20. y' - y = 1Iy
2x cos 2y dy = 0
x tan (ylx) + Y
=
23. (y cos xy - 2x) dx
=
24. xy'
+
Y sin x = sin x
(y -
2X)2
25. sin (y - x) dx
26. xy' = (ylx)3
+
+)'
lx cos x)' + 2y) dy = 0
(Set.\· - 2x = z.)
+ [cos (y
+y
- x) - sin (y - x)] dy = 0
127-321 INITIAL VALUE PROBLEMS
Solve the following initial value problems. Indicate the
method used. Show the details of your work.
27. yy' + x = O. y(3) = 4
2S.y'
3y=-12 y 2. y(O)=2
29.
.v'
30. y'
+
=
I
+
7T)' =
)'2,
+
y 21x
yeO)
=
0
+ (2 + 2X2)') dy = o. yeO) =
+ eX(l + llx)] dx + lx + 2y) dy =
31. (2 xy 2 - sin
32. [2y
Y(~7T) = 0
2b cos 7TX,
x)
dx
I
0,
y(l) = I
~MILE PICARD (1856-1941). French mathematician. also known for his important contributions to complex
analysis (see Sec. 16.2 for his famous theorem). Picard u~ed his method to prove Theorems I and 2 as well as
the convergence ofthe sequence (7) to the solution of (I). In precomputer times the iteration was oflittle practical
value because of the integrations.
43
Summary of Chapter 1
133-431
APPLICATIONS, MODELING
33. ~Heat flow) If the isothelms in a region are x 2 - )'2 = c,
what are the curves of heat flow (assuming orthogonality)?
34. (Law of cooling) A thennometer showing WaC is
brought into a room whose temperature is 25°C. After
5 minutes it shows 20°C. When will the thennometer
practically reach the room temperature, say, 24.9°C?
35. (Half-life) If 10o/c of a radioactive substance disintegrates
in 4 days, what is it~ half-life?
36. (HaIf-life) What is the half-life of a substance if after
5 days, 0.020 g is present and after 10 days, 0.015 g?
37. (HaIf-life) When will 99% of the substance in Prob. 35
have disintegrated?
38. (Air circulation) In a room containing 20000 ft3 of
air, 600 ft3 of fresh air tlows in per minute, and the
mixture (made practically unifonn by circulating fans)
is exhausted at a rate of 600 cubic feet per minute
(cfm). What is the amount of fresh air Yl1) at any time
if yeO) = 0'1 After what time will 90% of the air be
fresh?
39. (Electric field) If the equipotential lines in a region of
the x)"-plane are 4x 2 + y2 = c, what are the curves of
the electIical force? Sketch both families of curves.
40. (Chemistry) In a bimolecular reaction A + B - ? M,
a moles per liter of a substance A and b moles per liter
of a substance B are combined. Under constant
temperature the rate of reaction is
y'
=
(Law of mass action);
k(a - y)(b - .1')
that is, y' is proportional to the product of the
concentrations of the substances that are reacting. where
)'(t) is the number of moles per liter which have reacted
after time t. Solve this ODE, assuming that a *- b.
41. (Population) Find the population y(1) if the birth rate is
proportional to y(r) and the death rate is proportional to
the square of y( f).
42. (Curves) Find all curve~ in the first quadrant of the Ayplane such that for every tangent. the segment between
the coordinate axes is bisected by the point of tangency.
(Make a sketch.)
43. (Optics) Lambert's law of absorption9 states that the
absorption of light in a thin transparent layer is
proportional to the thickness of the layer and to the
amount of light incident on that layer. Formulate this
law as an ODE and solve it.
--- .......... ...... -.-.- . _._ ......---
......
-
. . . . . . . ".'11..-.
•
_
......... _
-
Fi rst-Order ODEs
This chapter concerns ordinary differential equations (ODEs) of first order and
their applications. These are equations of the form
F(x, y, /)
= 0
or in explicit form
y'
=
I(x, y)
involving the derivative y' = dy/dx of an unknown function y, given functions of
x, and, perhaps, y itself. If the independent variable x is time, we denote it by t.
In Sec. 1.1 we explained the basic concepts and the process of modeling, that is,
of expressing a physical or other problem in some mathematical form and solving
it. Then we discussed the method of direction fields (Sec. 1.2), solution methods
and models (Secs. 1.3-1.6), and, finally, ideas on existence and uniqueness of
solutions (Sec. 1.7).
9]OHANN HEINRICH LAMBERT (1728-1777), German physicist and mathematician.
44
CHAP. 1
First-Order ODEs
A first-order ODE usually has a general solution, that is. a solution involving an
arbitrary constant, which we denote by c. In applications we usually have to find a
unique solution by determining a value of c from an initial condition y(xo) = Yo.
Together with the ODE this is called an initial value problem
y' =
(2)
f(x, y),
y(xo,)
=
Yo
(xo, Yo given numbers)
and its solution is a particular solution of the ODE. Geometrically. a general
solution represents a family of curves, which can be graphed by using direction
fields (Sec. L.2). And each particular ~ulution corresponds to one of these curves.
A separable ODE is one that we can put into the form
(3)
g(y) dy
=
f(x) tit
(Sec. 1.3)
by algebraic manipulations (possibly combined with transformations, such as ylx
and solve by integrating on both sides.
An exact ODE is of the form
(4)
where M dx
M(x. y) dx
+
+ N(x. y) dy = 0
= u)
(Sec. 1.4)
N dy is the differential
du =
Ux
dx
+ uy
dy
of a function u(x, .v), so that from du = 0 we immediately get the implicit general
solution u(x, y) = c. This method extends to nonexact ODEs that can be made exact
by mUltiplying them by some function F(x, y), called an integrating factor (Sec. 1.4).
Linear ODEs
(5)
y' + p(x)y =
rex)
are very important. Their solutions are given by the integral formula (4), Sec. 1.5.
Certain nonlinear ODEs can be transformed to linear form in terms of new variables.
This holds for the Bernoulli equation
y' +
p(x)y = g(x)yU
(Sec. 1.5).
Applications and modeling are discussed throughout the chapter. in particular in
Secs. 1.1. 1.3. 1.5 (population dynamics, etc.). and 1.6 (trajectories).
Picard's existence and uniqueness theorems are explained in Sec. 1.7 (and
Picard's iterati()l1 in Problem Set l.7).
Numeric methods for first-order ODEs can be studied in Secs. 21.1 and 21.2
immediately after this chapter, as indicated in the chapter opening.
.....
2
CHAPTER
•
•
-~~
I
I
:..
Second-Order Linear ODEs
Ordinary differential equations (ODEs) may be divided into two large classes, linear
ODEs and nonlinear ODEs. Whereas nonlinear ODEs of second (and higher) order
generally are difficult to solve, linear ODEs are much simpler because various properties
of their solutions can be characterized in a general way, and there are standard methods
for solving many of these equations.
Linear ODEs of the second order are the most important ones because of their
applications in mechanical and electrical engineering (Secs. 2.4. 2.8. 2.9). And their theory
is typical of that of all linear ODEs, but the formulas are simpler than for higher order
equations. Also the transition to higher order (in Chap. 3) will be almost immediate.
This chapter includes the derivation of general and particular solutions, the latter in
connection with initial value problems.
(Boundary value problems follow in Chap. 5, which also contains solution methods for
Legendre's, Bessel's, and the hypergeometric equations.)
COM M ENT. NllInerics for second-order ODEs can be studied immediately after this
chapter. See Sec. 21.3, which is independent of other sections in Chaps. 19-21.
Prerequisite: Chap. 1, in particular. Sec. IS
Sections that may be omitted in a shorter course: 2.3. 2.9, 2.10.
References and Answers to Problems: App. 1 Part A, and App. 2.
2.1
Homogeneous Linear ODEs of Second Order
We have already considered first-order linear ODEs (Sec. 1.5) and shall now define and
discuss linear ODEs of second order. These equations have important engineering
applications, especiaUy in connection with mechanical and electrical vibrations (Secs. 2.4,
2.8, 2.9) as well as in wave motion, heat conduction, and other parts of physics, as we
shall see in Chap. 12.
A second-order ODE is called linear if it can be wriuen
(1)
y"
+ p(x)y' + q(x)y = r(x)
and nonlinear if it cannot be written in this form.
The distinctive feature of this equation is that it is linear in y alld its derivatives, whereas
the functions p, q, and r on the right may be any given functions of x. If the equation
begins with, say, f(x)y", then divide by f(x) to have the standard form (1) with y" as
the first term, which is practical.
45
CHAP. 2
46
Second-Order Linear ODEs
If rex) == 0 (that is, rex)
reduces to
=
0 for all x considered; read "r(x) is identically zero"), then
( I)
(2)
y"
+ p(x)y' +
q(x)y
=
0
and is called homogeneous. If rex) =/= 0, then (1) is called nonhomogeneous. This is
similar to Sec. 1.5.
For instance, a nonhomogeneous linear ODE is
y"
+ 25y = e- x cos x,
and a homogeneous linear ODE is
xy"
+ y' +
A}' =
0,
.v" + -x'v' + Y = O.
in standard form
An example of a nonlinear ODE is
y"y
+
/2
=
O.
The functions p and q in (l) and (2) are called the coefficients of the ODEs.
Solutions are defined similarly as for first-order ODEs in Chap. 1. A function
)' =
Iz(x)
is called a solution of a (linear or nonlinear) second-order ODE on some open interval I
if h is defined and twice differentiable throughout that interval and is such that the ODE
becomes an identity if we replace the unknown y by 11, the derivative y' by Iz', and the
second derivative y" by It. Examples are given below.
Homogeneous Linear ODEs: Superposition Principle
Sections 2.1-2.6 will be devoted to homogeneous linear ODEs (2) and the remaining
sections of the chapter to nonhomogeneous linear ODEs.
Linear ODEs have a rich solution structure. For the homogeneous equation the backbone
of this structure is the SUPCI7Jositio11 principle or lillearit), principle, which says that we
can obtain further solutions from given ones by adding them or by multiplying them with
any constants. Of course, this is a great advantage of homogeneous linear ODEs. Let us
first discuss an example.
E X AMP L E 1
Homogeneous Linear ODEs: Superposition of Solutions
The functions y = cos x and y = sin x are solutions of the homogeneous linear ODE
y" + Y
=
0
for all x. We verify this by differentiation and ,ubstitution. We obtain (cos r)" = -cos x; hence
y"
+Y
=
(cos.r)"
+ cosx
=
-cos ..
+ cos x
=
O.
Similarly for y = sin x (verify!). We can go an impollant step fUrlher. We multiply cos x by any constant. for
instance. 4.7. and sin.\" by. say. -2. and take the sum of the results. claiming thm it is a solution. Indeed.
differentiation and substitution gives
(4.7 cos x - 2 sin r)"
+ (4.7 cos X
-
2 sin x) = -4.7 cos X
+ 2 sin X + 4.7 cos I
-
2 sinx = D.
•
SEC. 2.1
47
Homogeneous Linear ODEs of Second Order
In this example we have obtained from)'1 (= cos x) and)'2 (= sin x) a function of the fonn
(CI, C2 arbitrary constants).
(3)
This is called a linear combination of YI and .1'2' In terms of this concept we can now
formulate the result suggested by our example. often called the superposition principle
or linearity principle.
THEOREM 1
Fundamental Theorem for the Homogeneous Linear ODE (2)
For a homogeneous linear ODE (2), an)' linear combination of two solutions on an
open interl'Ol I is again a solution of (2) 011 I. In pm1icular, for sucb an equation.
sums and cOllSta1l117luitipies of solutions are again solutions.
PROOF
Let YI and Y2 be solutions of (2) on I. Then by substituting Y = CI)'I + C2Y2 and its
derivatives into (2), and using the familiar rule (Cd'i + (2)'2)' = CIY~ + C2Y~' etc., we
get
)''' + py' +
q)'
= (CIYI +
C2Y2)"
+ P(CIYI + C2Y2)' + q(CI)'I +
= CI)'~ + C2."~ + P(CIY~ + C2Y~) +
= CI()'~ + py~ +
q(CI)'I
+
C2Y2)
+ C2()'~ + PY~ + qY2) =
q)'I)
C2Y2)
0,
since in the last line, ( ... ) = 0 because )'1 and Y2 are solutions, by assumption. This
that Y is a solution of (2) on I.
show~
•
CAUTION! Don't forget that this highly important theorem hold~ for homogeneo/ls
linear ODEs only but does not hold for nonhomogeneous linear or nonlinear ODEs, as
the following t~o examples illustrate.
E X AMP L E 2
A Nonhomogeneous Linear ODE
Verify by substitution that the functions y
linear ODE
~
I
+ cos t and y
y"
but their sum
E X AMP L E 3
j,
+Y
=
=
I
+ sin \. are solutions of the nonhomogeneou,
I.
not a solution. Neither is, for instance, 2( I + cos x) Or 5(1 + sin x).
•
A Nonlinear ODE
Verify by sub~titution that the runctions y
= x 2 and y = I are solutions of the nonlinear ODE
y")" - xy' = O.
but their sum is not a solution. Neither is -x2 , so you cannot even mUltiply by -I!
•
Initial Value Problem. Basis. General Solution
Recall from Chap. I that for a first-order ODE, an initial value problem consists of the
ODE and one initial condition y(xo) = Yo. The initial condition is used to determine the
arbitrary constant c in the general solution of the ODE. This results in a unique solution,
as we need it in most applications. That solution is called a particular solution of the
ODE. These ideas extend to second-order equations as follows.
CHAP. 2
48
Second-Order Linear ODEs
For a second-order homogeneous linear ODE (2) an initial value problem consists of
(2) and two initial conditions
(4)
These conditions prescribe given values Ko and Kl of the solution and its first derivative
(the slope of its curve) at the same given x = Xo in the open interval considered.
The conditions (4) are used to determine the two arbitrary constants CI and C2 in a
general solution
(5)
of the ODE; here. )'1 and )'2 are suitable solutions of the ODE, with "suitable" to be
explained after the next example. This results in a unique solution, passing through the
point (xo, Ko) with KI as the tangent direction (the slope) at that point. That solution is
called a particular solution of the ODE (2).
E X AMP L E 4
Initial Value Problem
Solve the initial value problem
y"
+ Y = 0,
,,(0)
=
y' (0)
3.0,
=
-0.5.
Solution.
Step 1. General so/Iltioll. The Functions cos x and sin x are solutions of the ODE (by Example
I), and we take
y = cl cos x
+
c2 sinx.
This will turn out to be a general solution as defined below.
Step 2. ParticlI/ar SO/lItiOIi. We need the derivative y' =
we obtain, since cos 0 = I and sin 0 = O.
yeO) =
cl
= 3.0
-cl
and
sin x
+
y' (0)
= C2 =
c2 cos x. From this and the initial values
-0.5.
This gives as the solution of our initial value problem the particular solution
y = 3.0 cos x - 0.5 sin x.
Figure 28 shows that at x = 0 it ha, the value 3.0 and the slope -0.5, so that its tangent intersects the x-axis
•
at x = 3.010.5 = 6.0. (The scales on the axes differ!)
y
3
2
0
-1
-2
-3
Fig. 28.
~
I
V6'VJ'
Particular solution and initial tangent in Example 4
Observation. Our choice of )'1 and )'2 was general enough to satisfy both initial
conditions. Now let us take instead two proportional solutions )'1 = cos x and
SEC. 2.1
49
Homogeneous Linear ODEs of Second Order
Y2
= k cos x, so that
y I /.\'2
= 11k =
COilS/'
Then we can write y
=
CIY!
+
C2Y2
in the
form
)' = c i cos X + c2(k cos x)
=
C cos x
where
Hence we are no longer able to satisfy two initial conditions with only one arbitrary
constant C. Consequently, in defining the concept of a general solution, we must exclude
proportionality. And we see at the same time why the concept of a general solution is of
importance in connection with initial value problems.
D E FIN I T ION
I
General Solution, Basis, Particular Solution
A general solution of an ODE (2) on an open interval I is a solution (5) in which
Yt and Y2 are solutions of (2) on I that are not proportional, and C I and C2 are arbitrary
constants. These YI, Y2 are called a basis (or a fundamental system) of solutions
of (2) on 1.
A particular solution of (2) on I is obtained if we assign specific values to CI
and C2 in (5).
For the definition of an interval see Sec. 1.1. Also, C I and C2 must sometimes be restIicted
to some interval in order to avoid complex expressions in the solution. Furthermore, as
usual, Yl and )'2 are called proportional on I if for all x on I,
(6)
or
(b)
Y2 = l.vI
where k and I are numbers, zero or not. (Note that (a) implies (b) if and only if k =1= 0).
Actually, we can reformulate our definition of a basis by using a concept of general
importance. Namely, two functions )'1 and Y2 are called linearly independent on an
interval I where they are defined if
everywhere on I implies
(7)
And Yl and Y2 are called linearly dependent on I if (7) also holds for some constants
k I , k2 not both zero. Then if kl =t= 0 or k2 =1= 0, we can divide and see that YI and Y2 are
proportional,
or
In contrast, in the case of linear independence these functions are not proportional because
then we cannot divide in (7). This gives the following
DEFINITION
Basis (Reformulated)
A basis of solutions of (2) on an open interval I is a pair of linearly independent
solutions of (2) on I.
If the coefficients p and q of (2) are continuous on some open interval I, then (2) has a
general solution. It yields the unique solution of any initial value problem (2), (4). It
CHAP. 2
)
Second-Order Linear ODEs
includes all solutions of (2) on J; hence (2) has no singular solutions (solutions not
obtainable from of a general solution; see also Problem Set 1.1). All this will be shown
in Sec. 2.6.
E X AMP L E 5
Basis, General Solution, Particular Solution
and sin x in Example 4 form a basis of solutions of the ODE)"" + )" = 0 for all \" because their quotient
is cot x
COllst (or tan x
COllSt). Hence y = Cl cos x + c2 sin x is a general solution. The solution
)" = 3.0 cos x - 0.5 sin x of the initial value problem is a particular solution.
•
COS \"
E X AMP L E 6
"*
"*
Basis, General Solution, Particular Solution
Verify by substitution that)"1 = eX and)"2 = e -x are solutions of the ODE ,," - y = O. Then solve the initial
value problem
y" - y = 0,
y(O) = 6,
y'(O) =
-2.
(ex)" - eX = 0 and (e- x )" - e- x = 0 shows that eX and e- x are solutions. They are not
proportional. eXle- x = e 2x COllst. Hence eX, e- x form a basis for aUx. We now write down the corresponding
general solution and its derivative and equate their values at 0 to the given initial conditions,
Solution.
"*
= cl + c2 = 6.
\'(0)
By addition and subtraction. cl = 2. c2 = 4, so that the allswer is y = 2e x + 4e -.<. This is the particular solution
satisfying thc two initial conditions.
•
Find a Basis if One Solution Is Known.
Reduction of Order
It happens quite often that one solution can be found by inspection or in some other way.
Then a second linearly independent solution can be obtained by solving a first-order ODE.
This is called the method of reduction of order. l We first show this method for an example
and then in general.
E X AMP - 7
Reduction of Order if a Solution Is Known. Basis
Find a basis of solutions of the ODE
(x 2 - X)y" - xy' + Y = O.
yr
Inspection shows that Yt = x is a solution because )'~ = 1 and
= O. so that the first term
vanbhes identically and the second and third terms cancel. The idea of the method is to substitute
Solution.
y =
ll)"l
= llX,
y' =
U'(
+
,," = tt"X
Lt.
+ 2,,'
into the ODE. This gives
(x
ux and
-XII
2
-
x)(u"x
+ 2/1')
- r(u'x
+
ll)
+ U.1:
=
O.
cancel and we are left with the following ODE. which we divide by x. order. and simplify,
{X
2
-
xlu"
+ (x
- 2)11' = O.
ICredited to the great mathematician JOSEPH LOUIS LAGRANGE (1736-1813). who was born in Turin.
of French extraction. got his first professorship when he was 19 (al the Military Academy of Turin). became
director of the mathematical section of the Berlin Academy in 1766. and moved to Paris in 1787. His important
major work was in the calculus of variations. celestial mechanics, general mechanics (Mecallique a/Ja/ytique,
Paris, 1788), differential equations, approximation theory, algebra, and number theory.
SEC. 2.1
51
Homogeneous Linear ODEs of Second Order
This ODE is of first order in v
gives
dv
x - 2
= -
v
-2--
x - x
=
dx =
u', namely, (x 2
( 1 2)
---~
x
x-I
x) v'
-
dx
+ (x - 2) v
In
'
Ivl
=
= O. Separation of variables and integration
In
Ix - II -
2 In
Ix - 11
Ixl
= In -~2~
x
We need no con,tant of integration because we want to obtain a particular solution: similarly in the next
integration. Taking exponents and integrating again, we obtain
v
x-I
=
--2- =
x
~
""2.
-
x
x
u
=
Iv
dx
=
Ixl + ~x .
In
hence
.1'2 =
lIX =
x In
Ixl +
I.
Since Y1 = x and Y2 = x In Ixl + 1 are linearly independent (their quotient is not constant), we have obtained
a basis of solutions. valid for all positive x.
•
In this example we applied reduction of order to a homogeneous linear ODE [see (2)]
y"
+ p(x)y' +
q(x)y
= O.
Note that we now take the ODE in standard form, with y", not f(x)y"-this is essential
in applying our subsequent formulas. We assume a solution .VI of (2) on an open interval
I to be known and want to find a basis. For this we need a second linearly independent
solution .\"2 of (2) on 1. To get Y2, we substitute
y
=
Y2
=
y' =
UY1,
y~ = U'YI
+
uy~,
into (2). This gives
(8)
Collecting terms in u", u', and u, we have
Now comes the main point. Since )'1 is a solution of (2), the expression in the last
parentheses is zero. Hence u is gone, and we are left with an ODE in u' and u". We divide
this remaining ODE by )'1 and set u' = U, u" = V',
U"
+
u'
2y~
+ PYI
=
0,
,
V +
thus
Y1
(2V~
-'- + p
)V = O.
Y1
This is the desired first-order ODE, the reduced ODE. Separation of variables and
integration gives
dV
V
-=~
(2)'; )
- + p dx
and
.v 1
In
By taking exponents we finally obtain
(9)
V = -
1
Yl
2
e -fpdx.
Ivi
52
CHAP. 2
Second-Order Linear ODEs
Here U = u', so that u =
IV dx.
Hence the desired second solution is
Y2
The quotient )'21Yl = u =
a basis of solutions .
_...
.•. -........
.
........
,--....
11-61
= Yl
JV dx.
-.
..,.
GENERAL SOLUTION. INITIAL VALUE
PROBLEM
2),' + 2y = 0, e- x cos x, e- x sin x,
yeO) = I. /(0) = -1
4. y" - 6y' + 9y = 0, e 3x , xe 3x , -,,(0) = -1.4,
y' (0) = 4.6
5. x 2y" + .n-' - 4v = 0, x 2, x- 2, v(l) = II,
y'(1)=-=-6
.
.
6.
Yl 11
IV dx cannot be constant (since v> 0), so that Yl and Y2 fonn
(More in the next problem set.) Verify by substitution that
the given functions fonn a basis. Solve the given initial
value problem. (Show the details of your work.)
1. y" - 16y = 0, e 4x , e- 4x
yeO) = 3, /(0) = 8
2. y" + 25y = O. cos 5x. sin 5x. yeO) = 0.8,
),'(0) = -6.5
3. y"
=
+
x\" - 7rr' + 15)' = 0, x3 , x5 , y(l) = O.~.
/(1) = 1.0
[7-141 LINEAR INDEPENDENCE AND DEPENDENCE
Are the following functions linearly independent on the
given interval?
7. x, x In x (0 < r < 10)
8. 3x 2 , 2x n (0 < X < I)
9. e ax , e- ax (any interval)
10. cos 2 x, sin2 \. (any interval)
11. In x, In x 2 (x > 0)
12. x - 2, x + 2 (-2 < x < 2)
13. 5 sin x co~ x. 3 sin 2x (x > 0)
14. 0, sinh TTX (x > 0)
REDUCTION OF ORDER is important because it gives a
simpler ODE. A second-order ODE F(x, y, y', y") = 0, linear
or not, can be reduced to first order if y does not occur
explicitly (Prob. 15) or if x does not occur explicitly (Prob.
16) or if the ODE is homogeneoll~ linear and we know a
solution (see the text).
15. (Reduction) Show that F(x, y' , y") = 0 can be reduced
to first order in ;: = y' \from which y follows by
integration). Give two examples of your own.
16. (Reduction) Show that F(y, y'. y") = 0 can be reduced
to a first-order ODE with y as the independent variable
and y" = (d;:idr):. where z = y'; derive this by the
chain rule. Give two examples.
[17-221 Reduce to first order and solve (showing each
step in detail).
17. y" = ky'
18. / ' = I + /2
19. -"y" = 4y'2
20. xy"
21. -,,"
+ 2y' + xy
+ /3 siny =
=
0,
YI = X-I
cos x
0
22. (I - x 2 )y" - 2xy'
+
2.\'
=
0,
)'1
=
X
23. (Motion) A small body moves on a straight line. Its
velocity equals twice the reciprocal of its acceleration.
If at t = 0 the body has distance I m from the origin
and velocity 2 m/sec, what are its distance and velocity
after 3 sec?
24. (Hanging cable) It can be shown that the curve y(x)
of an inextensible flexible homogeneous cable
hanging between two fixed points is obtained by
solving y"
=
k~, where thc constant k depends
on the weight. This curve is called a catellary (from
Latin catella = the chain). Find and graph y(x).
assuming k = I and those fixed points are (-1,0) and
(1, 0) in a vertical .l}'-plane.
25. (Curves) Find and sketch or graph the curves passing
through the origin with slope I for which the second
derivative is proportional to the first.
26. WRITING PROJECT. General Properties of
Solutions of Linear ODEs. Write a short essay (with
proofs and simple examples of your own) that includes
the following.
(a) The superposition principle.
(b) y;: 0 is a solmion of the homogeneous equarion
(2) (called the trivial solution).
(cl The sum y = YI + )'2 of a solution
Y2 of (2) is a solmion of (1).
)'1
of (1) and
(d) Explore possibilities of making further general
statements on solutions of (1) and (2) (sums.
differences, multiples).
27. CAS PROJECT. Linear Independence. Write a
program for testing linear independence and
dependence. Try it out on some of the problems in this
problem set and on examples of your own.
SEC. 2.2
2.2
51
Homogeneous Linear ODEs with Constant Coefficients
Homogeneous Linear ODEs
with Constant Coefficients
We shall now consider second-order homogeneous linear ODEs whose coefficients a and
b are constant,
(1)
y"
+ ay' + by =
O.
These equations have imp0l1ant applications, especially in connection with mechanical
and electrical vibrations, as we shall see in Secs. 2.4, 2.8, and 2.9.
How to solve (I)? We remember from Sec. 1.5 that the solution of the first-order linear
ODE with a constant coefficient k
y'+ky=O
is an exponential function y
function
= ce- kx• This gives us the idea to try as a solution of (1) the
(2)
Substituting (2) and its derivatives
and
into our equation (1), we obtain
(A2
+
aA
+ b)eAX =
o.
Hence if A is a solution of the important characteristic equation (or auxiliary equation)
(3)
A2
+
aA
+
b
=
0
then the exponential function (2) is a solution of the ODE (1). Now from elementary
algebra we recall that the roots of this quadratic equation (3) are
(4)
A1
=
I(-a
2
+ Va 2
-
4b) '
(3) and (4) will be basic because our derivation shows that the functions
and
(5)
are solutions of (1). Verify this by substituting (5) into (1).
From algebra we f1l11her know that the quadratic equation (3) may have three kinds of
roots, depending on the sign of the discriminant a 2 - 4b, namely,
(Case I)
(Case II)
(Case III)
Two real roots if a 2 - 4b > 0,
A real double roOT if a 2 - 4b = 0,
Complex conjugate roots if a 2 - 4b < O.
54
CHAP. 2
Second-Order Linear ODEs
Case I. Two Distinct Real Roots
"-1 and "-2
In this case, a basis of solutions of (I) on any interval is
and
because )'1 and )'2 are defined (and real) for all x and their quotient is not constant. The
corresponding general solution is
(6)
E X AMP L E 1
General Solution in the Case of Distinct Real Roots
We can now solve -,," - y = 0 in Example 6 of Sec. 2.1 systematically. The characteristic equation is
A2 - 1 = O. Its roob are A1 = I and A2 = - I. Hence a basis of solutions is e'" and e -x and gi ves the same
general solution as before,
•
E X AMP L E 2
Initial Value Problem in the Case of Distinct Real Roots
Sol ve the initial value problem
/' + y' - 2)'
Solution.
= 0,
y'(O) = -5.
)'(0) = 4,
Step 1. Gel1eral SOlllliol1. The characteristic equation is
A2 + A - 2 = O.
Its roots are
and
"-2
=l(-I -
V9)
=-2
so that we obtain the general solution
Slep 2. Particlilar SOllllioll. Since /(x)
conditions
=
cle x - 2c2e-2x. we obtain from the general solution and the initial
y(O)
y' (0)
=
Cj
= cl
+ c2 = 4,
2C2 = -5.
Hence c 1 = I and c2 = 3. This gives the lI11SII'e1' Y = eX + 3e -2x. Figure 29 shows that the curve begins ar
y = 4 with a negative slope (-5, but note that the axes have different scales!), in agreement with the initial
conditions,
•
y
8
:~
2
°O~~O~,~5---7---1~.~5---=2---x
Fig. 29.
Solution in Example 2
SEC. 2.2
55
Homogeneous Linear ODEs with Constant Coefficients
Case II. Real Double Root A = - 0/2
If the discriminant a 2 - 4b is zero, we see directly from (4) that we get only one root,
A = Al = 11.2 = -al2. hence only one solution.
To obtain a second independent solution )'2 (needed for a basis), we use the method of
reduction of order discussed in the last section, setting Y2 = UY1' Substituting this and its
derivatives y~ = U')'l + uy~ and Y~ into (1), we first have
(/1'\1
+ 2u' Y~ + uy~) + a(u' Y1 + It:r~) +
bUYl =
o.
Collecting terms in u", u', and u, as in the last section, we obtain
The expression in the last parentheses is zero, since .\'1 is a solution of (1). The expression
in the first parentheses is zero. too. since
We are thus left with U"Y1 = O. Hence u" = O. By two integrations, u = C1X + C 2 . To
get a second independent solution Y2 = UY1, we can simply choose C1 = 1, C2 = 0 and
take u = x. Then Y2 = .lYl' Since these solutions are not proportional, they form a basis.
Hence in the case of a double root of (3) a basis of solutions of (l) on any interval is
The corresponding general solution is
Warning.
of (l).
E X AMP L E 1
If A is a simple root of (4), then
(C1
+
c2x)eAcC with
C2
* 0 is not a solution
General Solution in the Case of a Double Root
The characteristic equalion of the ODE yo" + 6y' + 9y = 0 is A2 + 6A + 9 = (A + 3)2 = O. It has the double
root A = -3. Hence a basis is e- 3", and xe- 3x . The conesponding general solution is y = (cl + c2x)e-3x. •
E X AMP L E 4
Initial Value Problem in the Case of a Double Root
Solve the initial value problem
/' + y' + 0.25y
= 0,
yeo) =
3.0,
Solution. The characteristic equation is A2 + A + 0.25 = (A
This gives the general solution
We need its derivative
y' (0) = -
+ 0.5)2
=
3.5.
O. It has the double root A = -0.5.
56
CHAP. 2
Second-Order Linear ODEs
From this and the initial
yeO) =
condition~
we obtain
3.0.
Cl =
/(0) =
0.5cl = -3.5:
C2 -
hence
C2
=
-2.
•
The particular solution of the initial value problem is " = (3 - 2x)e -O.5x. See Fig. 30.
y
3
2
\
o 1--"-;:-:zL____
- -.L_-~-L--.L8-=="----'----'L.-4
-1
Solution in Example 4
Fig. 30.
Case III. Complex Roots
-10 + iwand -10 -
iw
This case occurs if the discriminant a 2 - 4b of the characteristic equation (3) is negative.
In this case, the roots of (3) and thus the solutions of the ODE (I) come at first out
complex. However, we show that from them we can obtain a basis of real solutions
Yl = e- ax/ 2 cos wx.
(8)
Y2
= e- ax/ 2 sin wx
(w>
0)
!a
2
. It can be verified by substitution that these are solutions in the
where ~ = b present case. We shall derive them systematically after the two examples by using the
complex exponential function. They form a basis on any interval since their quotient
cot wx is not constant. Hence a real general solution in Case II] is
(9)
E X AMP L E 5
=
y
e- ax/ 2 (A cos wx
+ B sin wx)
(A, B arbitrary).
Complex Roots. Initial Value Problem
Solve the initial value problem
y" +
DAy'
+
9.04y = D.
yeO) =
Solution.
o.
Step 1. General SOllltioli. The characteristic equation is
-0.2 ± 3i. Hence w = 3. and a general solution (9) is
)' =
y' (0)
}..2
=
+ 0.4}" + 9.04 = O. It has the roots
e- o.2x(A co~ 3x + B sin 3x).
Step 2. Pm1iclIiar soilltioll. The first initial condition gives y(O) = A
O 2x
)' = Be- .
sin 3x. We need the derivative (chain rule!)
y'
= 3.
=
O. The remaining expression is
B(-O.2e- O. 2 1: sin 3x + 3e-O.2x cos 3x).
From this and the second imtial condition we obtain
y' (0) =
3B
=
3. Hence B
=
I. OUf solution is
y = e -O.2x sin 3x.
Figure 31 shows \' and the curves of e- O .2x and _e- O.2x (dashed), between which the curve of)' oscillates.
Such "damped vibrations" (with x = t being time) have important mechanical and electrical applications. as we
shall soon see (in Sec. 2.4).
•
SEC. 2.2
57
Homogeneous Linear ODEs with Constant Coefficients
y
1.0
IT"
0.5"
I
n', ,
/1 ~ I\'~_
o
15·· 20
30
25
x
-0.5
-1.0
Fig. 31.
E X AMP L E 6
Solution in Example 5
Complex Roots
A general solution of the ODE
(lU
constant, not zero)
is
y = A cos
With
lU
ld,
+ B sin
lUX.
•
= 1 this confirms Example 4 in Sec. 2.1.
Summary of Cases I-III
[ Case
Basis of (1)
I
Distinct real
A10 A2
II
Real double root
A = -~a
e- a;</2,
III
Complex conjugate
Al = -~a + iw,
A2 = -~a - iw
e -a~'/2
e- ax/ 2
I
L
I
i
Roots of (2)
eA1X,
eA2X
xe-a~'/2
General Solution of (1)
y =
y
=
CIe'''l~'
(Cl
+
+
C2C"'2X
c2x )e- ax / 2
-
[
cos wx
sin wx
y =
e- a :li2 (A
cos
wx
+ B sin wx)
It is very interesting that in applications to mechanical systems or electrical circuits,
these three cases correspond to three different forms of motion or flows of current,
respectively. We shall discuss this basic relation between theory and practice in detail in
Sec. 2.4 (and again in Sec. 2.8).
Derivation in Case III.
Complex Exponential Function
If verification of the solutions in (8) satisfies you, skip the systematic derivation of these
real solutions from the complex solutions by means of the complex exponential function
e2 of a complex variable z = r + it. We write r + it, not x + iy because x and y occur
in the ODE. The definition of e Z in terms of the real functions eT , cos t, and sin t is
(10)
58
CHAP. 2
Second-Order Linear ODEs
This is motivated as follows. For real z = r, hence t = 0, cos 0 = I. sin 0 = 0, we get
the real exponential function eT • It can be shown that eZ , +Z2 = eZ1 eZ2 , just as in real. (Proof
in Sec. 13.5.) Finally, if we use the Maclaurin series of e Z with z = it as well as ;2 = -1,
;3 = -;, i4 = 1, etc., and reorder the terms as shown (this is permissible, as can he proved),
we obtain the series
(it)2
(it)3
(it)4
(it)5
2!
3!
4!
5!
+ it + - - + - - + - - + - - + ...
=
;~
I -
+
:~
- + ... + i
(t - ~~
+
~~
- + ... )
= cos t + i sin t.
(Look up these real series in your calculus book if necessary.) We see that we have obtained
the fonnula
eit = cos t + i sin t,
(11)
called the Euler fonnula. Multiplication by eT gives (10).
For later use we note that e- it = cos (-t) + i sin (-t)
addition and subtraction of this and (II),
(12)
cos t
1.
(e,t
= "2
+
.
e-'t),
sin t
= cos t
1.
=-
2i
- i sin t. so that by
.
(e't - e- tt ),
After these comments on the definition (10), let us now turn to Ca~e Ill.
In Case III the radicand {/2 - 4b in (4) is negative. Hence 4b - {/2 is positive and,
using -v=T = i, we obtain in (4)
with w defined as in (8), Hence in (4),
A} = ~a
Using (10) with r
=
+ iw
-~ax and t
e~'X
=
and. similarly,
=
A2 = ~a - iw.
wx, we thus obtain
e- CaI2 )x+iwx
=
e-taI2h(cos wx
+ i sin wx)
We now add these two lines and multiply the result by ~. This gives .\', as in (8). Then
we subtract the second line from the first and multiply the result by I1(2i). This gives .\'2
as in (8). These results obtained by addition and multiplication by constants are again
solutions, as follows from the superposition principle in Sec. 2.1. This concludes the
derivation of these real solutions in Case 111.
SEC. 2.3
Differential Operators.
Optional
59
- .... - =---I_l~
GENERAL SOLUTION
Find a general solution. Check your answer by substitution.
1. y" - 6 .. ' - 7)' = 0
2. lOy" - 7/ + 1.2y = 0
3. 4y" - 20y' + 2Sy = 0
4. y"
+
+
417y'
417 2 y = 0
7. y"
8. y"
10. y" - 2.1'
+
13. y" -
12. y"
= 0
144y = 0
30. y"
14. y"
19.
+ hy
+ 2.4y' + 4.0)' =
+ y' - 0.96y =
0
0
0 for the given basis.
16. e O. 5x , e- 3 . 5x
18. 1, e- 3x
e<-l+i)X, e-(1+i)x
INITIAL VALUE PROBLEMS
Solve the initial value problem. Check that your answer
satisfies the ODE as well as the initial conditions. (Show
the details of your work.)
21. y" - 2y' - 3y = 0, yeO) = 2, /(0) = 14
y"
y"
+
+
lOy"
y"
+
lOy"
2.3
=
2Sy
=
O.
y(O) =
=
O. y' (0)
=
40
0, yeO) = 0, y' (0) = 20
l.S
S.6)' = 0, y(O) = 4, / (0) = - 3.8
35. (Verification) Show by substitution that.h in (8) is a
solution of (1 ).
0, yeO)
+ l8y' +
0
(D) Limits. Double roots should be limiting cases of
distinct roots AI, A2 as, say, A2 ~ AI' Experiment with
this idea. (Remember I'H6pital's rule from calculus.)
Can you aITive at xe A1X ? Give it a try.
+ Y = 0, yeO) = 4, /(0) = -6
4/ + S)" = 0, yeO) = 2, /(0) = -S
- SOy' + 6Sy = 0, yeO) = l.S, y' (0) =
2y'
17y'
-2, /(0) = -12
+ Y = 0, yeO) = 3.2, y' (0) =
+ (k 2 + w 2 )y = 0, yeO) = 1,
(A) Coefficient fonnulas. Show how a and b in (I)
can be expressed in tenns of Al and A2' Explain how
these formulas can be used in constructing equations
for given bases.
(B) Root zero. Solve y" + 4y' = 0 (i) by the present
method, and (ii) by reduction to first order. Can you
explain why the result must be the same in both cases?
Can you do the same for a general ODE y" + aJ' = O?
(C) Double root. Verify directly that xe AX with
A = -a/2 is a solution of (1) in the case of a double
root. Verify and explain why y = e- 2x is a solution of
y" - v' - 6y = 0 but xe- 2x is not.
1_ 1-321
22.
23.
24.
25.
26.
+
-4.S
34. TEAM PROJECT. General Properties of Solutions
=
20.
e 4 ", e- 4 ."
= 0, yeO) =
4y'
y' (0) =
2ky'
y'(O) = -k
['5-201
FIND ODE
Find an ODE)''' + ay'
15. e 2x , eX
17. e Xv3 , xe x v'3
+
0, yeO) = 2,
33. (Instability) Solve y" - y = 0 for the initial conditions
y(O) = 1, y' (0) = -1. Then change the initial conditions
to yeO) = 1.001, y' (0) = -0.999 and explain why this
small change of 0.001 at x = 0 causes a large change
later, e.g., 22 at x = 10.
0
=
917 2 y
29. 20y"
32. y" - 2y' - 24y
9. y" - 2y' - S.2S}' = 0
11. y"
+ s / + 0.62S)' =
28. y" - 9y
31. y"
+ 20y' - 99," = 0
+ 2),' + S)' = 0
- y' + 2.Sy = 0
+ 2.6/ + 1.69)' = 0
5. 100y"
6. y"
27. lOy"
=
3, y'(o)
=
-17
Optional
Differential Operators.
This short section can be omitted without interrupting the flow of ideas; it will not be
used in the sequel (except for the notations D.\', D2y, etc., for y', y", etc.).
Operational calculus means the technique and application of operators. Here, an
operator is a transformation that transforms a function into another function. Hence
differential calculus involves an operator, the differential operator D, which transforms
a (differentiable) function into its derivative. In operator notation we write
(1)
Dy
=
y'
dy
dx
CHAP. 2
60
Second-Order Linear ODEs
Similarly, for the higher derivatives we write D2y = D(Dy) = / ' , and so on. For example,
D sin = cos, D2 sin = -sin, etc.
For a homogeneous linear ODE y" + ay' + by = 0 with constant coefficients we can
now introduce the second-order differential operator
L = P(D) = D2
+
+
aD
bl.
where I is the identity operator defined by Iy = y. Then we can write that ODE as
Ly = PW)y = (D 2
(2)
+
aD
+
b/)y = O.
P suggests "polynomial." L is a linear operator. By definition this means that if Ly and
Lw exist (this is the case if y and ware twice differentiable), then L(e)' + kw) exists for
any constants e and k, and
L(e}'
+
kw)
=
+
eLy
kLw.
Let us show that from (2) we reach agreement with the results in Sec. 2.2. Since
(DeA)(.x) = Ae Ax and (D 2e A)(x) = )-..2e A"', we obtain
LeA(x) = PW)eA(x) = (D 2
+
aD + bf)eA(x)
(3)
= (11. 2 + all. + b)eA"C = P(A)eA"C = O.
This confirms our result of Sec. 2.2 that e AX is a solution of the ODE (2) if and only ~f A
is a solution of the characteristic equation peA) = O.
peA) is a polynomial in the usual sense of algebra. If we replace A by the operator D,
we obtain the "operator polynomial"' P(D). The point of this operatiollal calculus is that
P(D) call be treated just like all algebraic quantity. In particular. we can factor it.
E X AMP L E 1
Factorization, Solution of an ODE
Factor P(Dl =
if -
3D - 401 and solve P{D»)" = O.
D2 - 3D - 401 = (D - 8l)(D + 51) because P = I. Now (D - 8l)y = y' - 8y = 0 has the
solution)'1 = e 8x. Similarly. the solution of (D + 5I)y = 0 is)'2 = e -5x. This is a basis of P(D)y = 0 on any
interval. From the factorization we obtain the ODE, as expected.
Solutioll.
(D - 8[)(D + 5l)y
= (D
- 8[)(y' + 5y)
= D(y' +
=
y"
5y) - 8(/ + 5.\')
+ 5/
- 8/ -
40,· =
y" -
3y' - 40y = O.
Verify that this agrees with the result ot our method in Sec. 2.2. This is not unexpected because we factored
P(D) in the same way as the characteristic polynomial PIA) = A2 - 3A - 40.
•
It was essential that L in (2) has constant coefficients. Extension of operator methods to
variable-coefficient ODEs is more difficult and will not be considered here.
If operational methods were limited to the simple situations illustrated in this
section, it would perhaps not be worth mentioning. Actually, the power of the operator
approach appears in more complicated engineering problems, as we shall see in
Chap. 6.
SEC. 2.4
Modeling: Free Oscillations (Mass-Spring System)
--_....
-.-- _...
....-............
-...
- ~
----.
-~~
12. (4D 2
13. (D 2
Apply the given operator to the given functions. (Show all
steps in detail.)
1. (D - 1)2~ eX.. xe x .. sin x
+
cosh
-
I,
ix,
sinh
eO. 4 ."",
e- 5x sin x.
5. (D - 4I)(D + 3l); x 3 - x 2,
4. (D
[
~
5/)(D -]);
ix, ex /
e 5x • x 2
e- 3x
GENERAL SOLUTION
Factor as in the text and ~olve. (Show the details.)
6. (D 2 - 5.5D + 6.66l)y = 0
7. (D + 2l)2y = 0
8. (D 2 - 0.49])y
2
9. (D + 6D + 131))' = 0
10. (lOD 2
+
2D
+
=
0
0
14. (Double root) If D2 + aD + hI has distinct roots
JL and A, show that a particular solution is
y = (elL" - eAX)/{JL - A). Obtain from this a solution
xeAx by letting JL ~ A and applying I"H6pital's rule.
15. (Linear operator) Illustrate the linearity of L in (2) by
taking e = 4, k = -6, y = e 2X , and 11' = cos 2x.
Prove that L is linear.
16. (Definition of linearity) Show that the definitIOn of
linearity in the text is equivalent to the following. If
Lly] and L[w] exist. then L[y + w] exists and L[e)'J
and L[kw] exist for all constants e and k, and
Ll\" + w] = Lb'] + L[w] as well as L[ey] = eLl\"l and
2
xeO. 4x
sin 4x,
+ 4.1D + 3.II)y =
+ 47TD + 7T2l)y =
+ 17.64w2l)y = 0
11. (D 2
APPLICATION OF DIFFERENTIAL
OPERATORS
2. 8D 2 + 2D - I;
3. D - 0.41; 2x 3
61
0
L[kw] = kL[w].
1.7/))' = 0
2.4 Modeling: Free Oscillations
(Mass-Spring System)
Linear ODEs with constant coefficients have important applications in mechanics, as we
show now (and in Sec. 2.8), and in electric circuits (to be shown in Sec. 2.9). In this section
we consider a basic mechanical system, a mass on an elastic spring ("mass-spring system,"
Fig. 32). which moves up and down. Its model will be a homogeneous linear ODE.
Setting Up the Model
We take an ordinary spring that resists compression as well extension and suspend it
vertically from a fixed support, as shown in Fig. 32. At the lower end of the spring we
--1----1~---
U nstretched
spring
-
-
-(Y=O)l---~
System In
static
equilibrium
(a)
Fig. 32.
I
(b)
Y
----
System in
motion
(c)
Mechanical mass-spring system
62
CHAP. 2
Second-Order Linear ODEs
attach a body of mass 111. We assume /11 to be so large that we can neglect the mass of the
spring. If we pull the body down a certain distance and then release it, it starts moving.
We assume that it moves strictly vertically.
How can we obtain the motion of the body, say, the displacement y(t) as function of
time t? Now this motion is detelmined by Newton's second law
(1)
Mass X Acceleration
=
nH'''
= Force
where y" = d 2y/dr2 and "Force" is the resultant of all the forces acting on the body.
(For systems of units and conversion factors, see the inside of the front cover.)
We choose the dowllward directioll as the positive direction, thus regarding downward
forces as positive and upward forces as negative.
Consider Fig. 32. The spring is first unstretched. We now attach the body. This stretches
the spring by an amount So shown in the figure. It causes an upward force Fo in the spring.
Experiments show that Fo is proportional to the stretch So' say,
(Hooke's law2 ).
(2)
k (> 0) is called the spring constant (or spring modulus). The minus sign indicates that
Fo points upward, in our negative direction. Stiff springs have large k. (Explain!)
The extension So is such that Fo in the spring balances the weight W = mg of the
body (where g = 980 cm/sec 2 = 32.17 ftlsec 2 is the gravitational constant). Hence
F0 + W = - kso + 17lg = O. These forces will not affect the motion. Spring and body are
again at rest. This is called the static eqUilibrium of the system (Fig. 32b). We measure
the displacement yet) of the body from this 'equilibrium point' as the origin y = 0,
downward positive and upward negative.
From the position y = 0 we pull the body downward. This further stretches the spring
by some amount y > 0 (the distance we pull it down). By Hooke's law this causes an
(additional) upward force FI in the spring,
FI
= -kyo
F 1 is a restoring force. It has the tendency to restore the system, that is, to pull the body
back to y = O.
Undamped System: ODE and Solution
Every system has damping--otherwise it would keep moving forever. But practically, the
effect of damping may often be negligible, for example, for the motion of an iron ball on
a spring during a few minutes. Then F 1 is the only force in (I) causing the motion. Hence
(1) gives the model 111Y" = -kyor
(3)
my"
+ ky
=
O.
2ROBERT HOOKE (1635-1703), English physicist, a forerunner of Newton with respect to the law of
gravitation.
SEC 2.4
Modeling: Free Oscillations (Mass-Spring System)
63
By the method in Sec. 2.2 (see Example 6) we obtain as a general solution
(4)
y(t) = A cos Wot
+B
sin wof.
The corresponding motion is called a harmonic oscillation.
Since the trigonometric functions in (4) have the period 27T/WO' the body executes wo!27T
cycles per second. This is the frequency of the oscillation, which is also called the natural
frequency of the system. It is measured in cycles per second. Another name for cycles/sec
is hertz (Hz).3
2
The sum in (4) can be combined into a phase-shifted cosine with amplitude C =
+ B2
and phase angle 8 = arctan (B/A),
VA
(4*)
y(t) = C cos (wof - 8).
To verify this, apply the addition formula for the cosine [(6) in App. 3.1] to (4*) and then
compare with (4). Equation (4) is simpler in connection with initial value problems,
whereas (4*) is physically more informative because it exhibits the amplitude and phase
of the oscillation.
Figure 33 shows typical forms of (4) and (4*), all corresponding to some positive initial
displacement .\'(0) [which determines A = y(O) in (4)] and different initial velocities.r' (0)
[which determine B = y' (O)/wol
0 /
"
/
\
@/
/'
/'
---
/'
/'
/
/
/
/
/'
/
/
/
/
,/
CD Positive }
@Zero
Initial velocity
® Negative
Fig. 33.
E X AMP L E 1
Harmonic oscillations
Undamped Motion. Harmonic Oscillation
If an iron ball of weight W = 98 nt (about 22 Ib) stretches a spring 1.09 m (about 43 in.), how many cycles per
minme will this mass-spring system execute? What will its motion be if we pull down the weight an additional
16 cm (abom 6 in.) and let it start with zero initial velocity?
Solutio1l. Hooke's law (2) with W a~ the force and 1.09 meter as the stretch gives W = 1.09k: thus
2
k = WII.09 = 98/1.09 = 90 [kg/sec j = 90 [nt/meter]. The mass h III = WIg = 98/9.8 = 10 [kg]. This gives
the frequency wo/(27i) = v klml(27T) = 3/(27T) = 0.48 [Hz] = 29 [cycles/min].
3HEINRlCH HERTZ (1857-1894). German physicist. who discovered electromagnetic waves. as the basis
of wireless communication developed by GUGLIELMO MARCONI (1874-1937), Italian physicist (Nobel prize
in 1909).
64
CHAP. 2
Second-Order Linear ODEs
From (4) and the initial conditions, y(O)
= A = 0.16 [meter]
y(t) = 0.16 cos 31 [meter]
and y' (0)
= woB = O. Hence the motion is
0.52 cos 31 [ft]
or
(Fig. 34).
If you have a chance of experimenting with a mass-spring system. don't miss it. You will be surprised about
the good agreement between theory and experiment. usuall) within a fraction of one percent if you measure
carefully.
•
y
0.2
0.1
0~--,-L-~~L-~--L-4-~L--r--~---
-0.1
-0.2
Fig. 34.
Harmonic oscillation in Example 1
Damped System: ODE and Solutions
We now add a damping force
F2
to our model
(5)
=
-cy'
my" = -ky, so that we have my" = -ky - cy' or
my"
+ cy' + Ie)' =
O.
Physically this can be done by connecting the body to a dashpot; see Fig. 35. We assume
this new force to be proportional to the velocity y' = dyldt, as shown. This is generally
a good approximation, at least for small velocities.
c is called the damping constant. We show that c is positive. If at some instant, y' is
positive. the body is moving downward (which is the positive direction). Hence the
damping force F2 = -cy'. always acting against the direction of motion. must be an
upward force. which means that it must be negative, F2 = -cy' < 0, so that -c < 0 and
c > O. For an upward motion, y' < 0 and we have a downward F2 = -cy > 0; hence
-c < 0 and c > 0, as before.
The ODE (5) is homogeneous linear and has constant coefficients. Hence we can solve
it by the method in Sec. 2.2. The characteristic equation is (divide (5) by m)
A2
c
A
m
+ --
Fig. 35.
+
k
=
O.
111
Damped system
65
SEC. 2.4 Modeling: Free Oscillations (Mass-Spring System)
By the usual fonnula for the roots of a quadratic equation we obtain, as in Sec. 2.2,
(6)
A2
=
-a - {3,
where
a
=-
C
2m
and
{3 = _1_
2m
Vc
2 -
4mk.
It is now most interesting that depending on the amount of damping (much, medium, or linle)
there will be three types of motion cOlTesponding to the three Cases I, II, n in Sec. 2.2:
Case [.
Case II.
Case [II.
c 2 > 411lk.
c 2 = 4mk.
c 2 < 4mk.
Distinct real roots AI, A2 .
A real double root.
Complex conjugate roots.
(Overdamping)
(Critical damping)
(Underdamping)
Discussion of the Three Cases
Case I. Overdamping
If the damping constant c is so large that c 2 > 4mk, then Al and A2 are distinct real roots.
In this case the cOlTesponding general solution of (5) is
(7)
We see that in this case, damping takes out energy so quickly that the body does not
oscillate. For t > 0 both exponents in (7) are negative because a > 0, {3 > 0, and
132 = if - kim < if. Hence both terms in (7) approach zero as t ~ 00. Practically
speaking, after a sufficiently long time the mass will be at rest at the static equilibrium
position (y = 0). Figure 36 shows (7) for some typical initial conditions.
Case II. Critical Damping
Critical damping is the border case between nonoscillatory motions (Case I) and oscillations
(Case III). It occurs if the characteristic equation has a double root, that is, if c 2 = 4mk,
y
y
\
(b)
(aJ
CD Positive
® Zero
}
Initial velocity
@Negative
Fig. 36.
Typical motions (7) in the overdamped case
(a) Positive initial displacement
(b) Negative initial displacement
66
CHAP. 2
Second-Order Linear ODEs
so that {3 = 0, Al = A2 = -a. Then the corresponding general solution of (5) i"
(8)
This solution can pass through the equilibrium position y = 0 at most once because e- at
is never zero and Cl + C2t can have at most one positive zero. If both Cl and C2 are positive
(or both negative), it has no positive zero, so that y does not pass through 0 at all. Figure
37 shows typical forms of (8). Note that they look almost like those in the previous figure.
Case III. Underdamping
This is the most interesting case. It occurs if the damping constant
c 2 < 4mk. Then {3 in (6) is no longer real but pure imaginary, say,
(9)
{3
=
where
iw*
w*
= _12m
V4111k -
C
is so small that
k
c2 =
(> 0).
171
(We write w* to reserve w for driving and electromotive forces in Secs. 2.8 and 2.9.) The
roots of the characteristic equation are now complex conjugate.
Al = -a
+
iw*,
A2 = -a - iw*
with a = C/(2m), as given in (6). Hence the corresponding general solution is
J(t)
(10)
= e-at(A cos w*t + B sin w*t) = Ce- at cos (w*t - 8)
where C2 = A2 + B2 and tan 8 = B/A, as in (4*).
This represents damped oscillations. Their curve lies between the dashed curves
y = Ce- at and y = -Ce- al in Fig. 38, touching them when w*t - 8 is an integer multiple
of 7T because these are the points at which cos (w*t - 8) equals lor-I.
The frequency is W*/(27T) Hz (hertz, cycles/sec). From (9) we see that the smaller c (> 0)
is, the larger is w* and the more rapid the oscillations become. If c approaches 0, then w*
approaches Wo = ~. giving the harmonic oscillation (4), whose frequency WO/(27T) is
the natural frequency of the system.
y
y
,
\ .......................... ce-at
/'''"---- --::-------
/
CD Positive
® Zero
...-"';"-
--
-at
-Ce
}
I nitial velocity
@Negative
ig. 37.
..,..--
Critical damping [see (8)]
Fig. 38.
Damped oscillation in
Case /1/ [see (10)]
SEC 2.4
67
Modeling: Free Oscillations (Mass-Spring System)
EX AMP L E 2
The Three Cases of Damped Motion
How doe~ the motion in Example 1 change if we change the damping constant c to one of the following three
values. with y(O) = 0.16 and /(0) = 0 as before?
(1) c =
100 kg/sec.
(Ill
c = 60 kg/sec.
(III)
c = 10 kg/sec.
Soilltioll.
It is interesting to see how the behavior of the system changes due to the effect of the damping,
which takes energy from the syslem. so that the oscillations decrease in amplitude (Case III) or even disappear
(Cases II and I).
(I) With m = 10 and k = 90, as in Example I, the model is the initial value problem
lOy"
+ 100y' +
90y = O.
/(0) ~ O.
yeO) = 0.16 [meter].
The characteristic equation is IOA + 100A + 90 = IO{A + 9)(A + 1) ~ O. It has the roots -9 and -1. This
gives the general solution
2
We also need
The initial conditions give cl + c2 = 0.16, -9'"1 the overdamped case the solution is
= o. The solution is cl
c2
=
-0.02,
c2 =
O.IS. Hence in
y = -0.02e- 9t + O.ISe- t .
11 approaches 0 as t
- 4 x. The approach is rapid: after a few seconds the solution is practically 0, that is. the
iron ball is at res\.
(III The model is as before. with c = 60 instead of 100. The characteristic equation now has the form
IOA2 + 60A + 90 = IO(A + 3)2 = O. It has the double root -3. Hence the corresponding general solution is
We also need
The initial conditions give y(Ol ~
solution is
cl
= 0.16. /
'"2 - 3,"[ = O. C2 = 0.4S. Hence in the critical case the
(0) =
y = (0.16
+ OASt)e- 3t .
It is always positive and decrea~es 10 0 in a monotone fashion.
(III) The model now is lOy" + lOy' + 90)' = o. Since c = 10 is smaller than the critical c, we shall get
oscillations. The characteristic equation is IOA2 + lOA + 90 = IO[(A + ~)2 + 9 - ~] = o. It has the complex
roots [see (4) in Sec. 2.2 with 1I = 1 and b ~ 9]
A ~ -0.5 ::': YO.52
-
9 = -0.5 ::': 2.96i.
This gives the general solution
}' = e -o.5t(A
cos 2.96t + 8 sin 2.96t).
Thus ."(0) = A = 0.16. We also need the derivative
y'
= e -O.5t( -0.5A cos 2.Y6t - 0.58 sin 2.Y6t - 2.YM sin 2.96t
Hence /(0) = -0.5A + 2.968 ~ O.
+ 2.968 cos 2.Y6tl.
8 = 0.5A/2.96 = 0.027. This gives the solution
y = e -o.5t(O.16 cos 2. libt
+ 0.027 sin 2.961)
=
o. 162e -O.5t cos (2.96t
- 0.17).
We see that these damped oscillations have a smaller frequency than the harmonic oscillations in Example 1 by
about lo/c (since 2.96 is smaller than 3.00 by abom 1o/c I. Their amplitude goes 10 zero. See Fig. 39.
•
y
0.15
0.1
0.05
--0.05
--0.1
Fig. 39.
The three solutions in Example 2
CHAP. 2
68
Second-Order Linear ODEs
This section concerned free motions of mass-spring systems. Their models are
homogeneous linear ODEs. Nonhomogeneous linear ODEs will arise as models of forced
motions, that is, motions under the influence of a "driving force". We shall study them
in Sec. 2.8, after we have learned how to solve those ODEs.
" "::=» -B"£EM_5:U-F::.4::
11-81
MOTION WITHOUT DAMPING
(HARMONIC OSCILLATIONS)
1. (Initial value problem) Find the harmonic motion (4)
2.
3.
that starts from Yo with initial velocity vo. Graph or
sketch the solutions for Wo = 71", Yo = I, and various
Vo of your choice on common axes. At what t-values
do all these curves intersect? Why?
(Spring combinations) Find the frequency of vibration
of a ball of mass 111 = 3 kg on a spring of modulus
(i) kl = 27 nt/m, (ii) k2 = 75 nt/m, (iii) on these springs
in parallel (see Fig. 40), (iv) in series, that is, the ball hangs
on one spling, which in tum hangs on the other spring.
(Pendulum) Find the frequency of oscillation of a
pendulum of length L Wig. 41), neglecting air
resistance and the weight of the rod, and assuming
to be so small that sin practically equals
(Frequency) What is the frequency of a harmonic
oscillation if the static equilibrium position of the ball
is 10 cm lower than the lower end of the spring before
the ball is attached?
(Initial velocity) Could you make a hannonic oscillation
move faster by gi\'ing the body a greater initial push?
(Archimedian principle) This principle states that the
buoyancy force equals the weight of the water
displaced by the body (partly or totally submerged).
The cylindrical buoy of diameter 60 cm in Fig. 42 is
floating in water with its axis vertical. Wben depressed
downward in the water and released, it vibrates with
period 2 sec. Wbat is its weight?
e
4.
5.
6.
e
e.
7. (Frequency) How does the frequency of a hannonic
motion change if we take (i) a spring of three times the
modulus, (ii) a heavier ball?
8. TEAM PROJECT. Harmonic Motions in Different
Physical Systems. Different physical or other systems
may have the same or similar models, thus showing the
ullifyillg power of mathematical methods. Illustrate
this for the systems in Figs. 43-45.
(a) Flat spring (Fig. 43). The spring is horizontally
clamped at one end, and a body of weight 25 nt (about
5.6Ib) is attached at the other end. Find the motion of
the system, assuming that its static equilibrium is 2 cm
below the horizontal line, we let the system start from
this position with initial velocity 15 cm/sec, and
damping is negligible.
(b) Torsional vibrations (Fig. 44). Undamped
torsional vibrations (rotations back and forth) of a wheel
attached to an elastic thin rod are modeled bv the ODE
loe " + Ke = 0, where e is the angle measured from the
state of equilibrium, 10 is the polar moment of inertia of
the wheel about its center, and K is the torsional stiffness
of the rod. Solve this ODE for Kilo = 17.64 sec- 2 , initial
angle 45°, and initial angular velocity 15° sec-I.
(c) Water in a tube (Fig. 45). What is the frequency
of vibration of 5 liters of water (about 1.3 gal) in a
U-shaped tube of diameter 4 cm, neglecting friction?
.
_____
~
n
-r~Fig. 43.
t
Flat spring (Project 8a)
(y=o)
Body of
massm
F·li!. 40. Parallel
springs (Problem 2)
Fig. 41. Pendulum
(Problem 3)
Fig. 44. Torsional
vibrations (Project 8b)
Water
level
Fig. 42.
Buoy (Problem 6)
19-171
Fig. 45.
Tube (Project Be)
DAMPED MOTION
9. (Frequency) Find an approximation formula for w* in
terms of Wo by applying the binomial theorem in (9)
and retaining only the first two terms. How good is the
approximation in Example 2, III?
SEC. 2.5
69
Euler-Cauchy Equations
10. (Extrema) Find the location of the maxima and
minima of)' = e- 2t cos ( obtained approximately from
a graph of .1' and compare it with the exact values
obtained by calculation.
(a) Avoiding ullllecessary generality is part of good
modeli1lg. Decide that the initial value problems (A)
and (B),
(A)
11. (Maxima) Show that the maxima of an underdamped
motion occur at equidistant (-values and find the
distance.
y"
+
cy'
+Y
= 0,
yeO) = 1,
y'(O) = 0
(B) the same with different c and y' (0) = -2 (instead
of 0), will give practically as much information as a
problem with other m, k, yeO), y' (0).
12. (Logarithmic decrement) Show that the ratio of two
consecutive maximum amplitndes of a damped oscillation
( 10) is constant, and the natnral logarithm of this ratio,
called the logarithmic decrel1lellt. equals j. = 27Texlw*.
Find .1 for the solutions of .1''' + 2y' + 5)' = O.
(b) COllsider (A). Choose suitable values of c, perhaps
better ones than in Fig. 46 for the transition from Case
III to II and I. Guess c for the curves in the figure.
(c) Time to go to rest. Theoretically, this time is
infinite (why?). Practically, the system is at rest when
its motion has become very small, say, less than 0.1 %
of the initial displacement (this choice being up to us),
that is in our case.
13. (Shock absorber) What is the smallest value of the
damping constant of a shock absorber in the suspension
of a wheel of a car (consisting of a spring and an absorber)
that will provide (theoretically) an oscillation-free ride
if the mass of the car is 2000 kg and the spring constant
equals 4500 kg/sec 2 ?
(1 I) ly(t)1
14. (Damping constant) Consider an underdamped
< 0.001 for all ( greater than some tl'
In engineering constructions, damping can often be varied
without too much trouble. Experimenting with your
graphs. frnd empirically a relation between tl and c.
motion of a body of mass III = 2 kg. If the time
between two consecutive maxima is 2 sec and the
maximum amplitude decreases to of its initial value
after 15 cycles. what is the damping constant of the
system?
!
(d) Solve (A) a1lalytically. Give a rea~on why the
solution c of Y«(2) = -0.001, with t2 the solution of
y' (t) = O. will give you the best possible c satisfying (11).
(e) Consider (B) empirically as in (a) and (b). What
is the main difference between (B) and (A)?
15. (Initial value problem) Find the critical motion (8)
that starts from Yo with initial velocity vo. Graph
solution curves for ex = 1, Yo = I and several Vo such
that (i) the curve does not intersect the t-axis, (ii) it
intersects it at ( = L, 2, ... ,5. respectively.
16. (Initial value problem) Find the overdamped motion
(7) that starts from Yo with initial velocity Vo.
10
17. (Overdamping) Show that in the overdamped case, the
body can pass through y = 0 at most once.
18. CAS PROJECT. Transition Between Cases I. II, Ill.
Study this transition in terms of graphs of typical
solutions. (Cf. Fig. 46.)
2.5
Fig. 46.
CAS Project 18
Euler-Cauchy Equations
Euler-Cauchy equations4 are ODEs of the form
(1)
x 2 }'''
+ ax}" +
by
=
0
4LEONHARD EULER (1707-1783) was an enormously creative Swiss mathematician. He made fundamental
contributions to almost all branches of mathematics and its application to physics. His important books on algebra
and calculus contain numerous basic results of his own research. The great French mathematician AUGUSTIN
LOUIS CAUCHY (1789-1857) is the father of modem analysis. He is the creator of complex analysis and had
great influence on ODEs, PDEs, infinite series, elasticity theory, and optics.
70
CHAP. 2
Second-Order Linear ODEs
with given constants a and b and unknown -"(Jo). We substitute
"= xfn
(2)
and its derivatives y' =
m
111X
-
l
and -,,"
=
1)x",-2 into (1). This gives
/11(111 -
We now see that (2) was a rather natural choice because we have obtained a common
factor xm. Dropping it, we have the auxiliary equation 11l(m - I) + am + b = 0 or
(3)
111
2
+
(a -
1)11l
+ b = O.
tNote: a - I, not a.)
Hence y = xnt is a solution of (1) if and only if I1l is a root of (3). The roots of (3) are
(4)
Case I.
1112 = !(l - a) -
If the roots
1111
and
1112
V~(l
- a)2 -
b.
are real and different. then solutions are
and
They are linearly independent since their quotient is not constant. Hence they constitute
a basis of solutions of (I) for all x for which they are real. The corresponding general
solution for all these x is
(5)
E X AMP L E 1
(Cl, C2
arbitrary).
General Solution in the Case of Different Real Roots
The Euler-Cauchy equation
x 2 y" + 1.5xy' - 0.5.1' = 0
has the auxiliary equlltion
1112
+ 0.5111 - 0.5
=
O.
The roots are 0.5 and -\. Hence a basis of solutions for all positive x is Yl
general solution
(Note: 0.5. not 1.5!)
= xO. 5 and Y2 =
IIx and gives the
(x> 0).
•
Case II. Equation (4) shows that the auxiliary equation (3) has a double root
=!O - a) if and only if (I - a)2 - 4b = O. The Euler-Cauchy equation (I) then
has the form
1111
(6)
A solution is h = x O - a )f2. To obtain a second linearly independent solution, we apply
the method of reduction of order from Sec. 2.1 as follows. Starting from )'2 = uy!, we
obtain for u the expression (9), Sec. 2.1, namely,
u=JUdx
where
U =
~
exp (- Jp dX) .
)'1
SEC. 2.5
71
Euler-Cauchy Equations
Here it is crucial that p is taken from the ODE written in standard form. in our case.
a
y " + -x .y
(6*)
,
+
This shows that p = alx (not ax). Hence its integral is a in x = In (xa), the exponential
function in U is Ih: a , and division by YI 2 = x 1 - a gives U = l/x, and u = In x by integration.
Thus, in this "critical case," a basis of solutions for positive x is Yl = xm and
Y2 = X 1n In x, where 111 =!O - a). Linear independence follows from the fact that the
quotient of these solutions is not constant. Hence, for all x for which )'1 and Yz are defined
and real, a general solution is
(7)
E X AMP L E 2
y
(e1
=
+
c21nx)xm,
111
= ~(l -
a).
General Solution in the Case of a Double Root
The Euler-Cauchy equation x 2v" - 5xy' + 9y = 0 has the auxiliary equation
double root III = 3, so that a general solution for all positive x is
nz2 -
6m
+9
=
O. It has the
•
Case III. The case of complex roots is of minor practical importance, and it suffices to
present an example that explains the derivation of real solutions from complex ones.
E X AMP L E 3
Real General Solution in the Case of Complex Roots
The Euler-Cauchy equation
hm; the auxiliary equation /11 2 - 0.4111 + 16.04 = O. The root" are complex conjugate. /Ill = 0.2 + 4i and
0.2 - 4i, where i = v'=T. (We know from algebra that if a polynomial with real coefficients has complex
roots. these are always conjugate.) Now use the trick of writing x = e ln ,. and obtain
nz2 =
xm1
=
xO.2+4i
=
xO.2(eln X)4i
=
xO.2/4In Xli,
xm2 = xO.2-4i = xO.2(eln X)-4i = xO.2e -(4 In xl i.
Next apply Euler's formula (11) in Sec. 2.2 with
x m1
X"'2
I =
4 In x to these two formulas. This gives
=
2
xO. [cos
(4 In x) + i sin (4 In x)],
=
xO. 2 rcos
(4 In x) - ; sin (4 In x)].
Add these two formulas. so that the sine drops uut. and divide the result by 2. Then subtract the second formula
from the first, so that the cosine drops out, and divide the result by 2i. This yields
XO. 2
cos (4 In x)
and
XO.
2
sin (4 In x)
respectively. By the superposition principle in Sec. 2.2 these are solutions of the Euler-Cauchy equation (I).
Since their quotient cot (4 In x) is not constant, they are linearly independent. Hence they form a basis of solutions,
and the corresponding real gencral solution for all positive x is
(8)
y
2
= xo. [A
cos (4 In x)
+ B sin (4Inx)l.
Figure 47 shows typical solution curves in the three cases discussed, in particular the basis functions in
•
Examples I and 3.
72
CHAP. 2
Second-Order Linear ODEs
Y,
3.0
xl. 5
Y
1.5
1.0
0.5
xl
2.0
xO. 5
0
2
x
Case I: Real roots
0
-0.5
-1.0
-1.5
x O.2 sin (4Inx)
\I
\
()\'\
O.ll
VV
1 1.4,
2
x
'\
'\
"-
x O.2 cos (4 In x)
Case III: Complex roots
Case II: Double root
Fig. 47.
E X AMP L E 4
1.5
1.0
0.5
x-1. 5 lnx
2 x
0
-0.5
-1.0
-1.5
1.0
Y
xlnx
xO.5lnx~
Euler-Cauchy equations
Boundary Value Problem. Electric Potential Field Between Two Concentric Spheres
Find the electrostatic potential v = vCr) between two concentric spheres of radii rl = 5 cm and r2 = 10 cm
kept at potentials VI = 110 Y and v2 = 0, respectively.
Physicallnjorlll{[tion. vCr) is a solution of the Euler-Cauchy equation rv" + 2v' = O. where v' = dvldr.
Solution.
vCr) =
Cl
The auxiliary equation is 1112 + III = O. It has the roots 0 and - 1. This gives the general solution
+ c2fr. From the "boundary conditions" (the potentials on the spheres) we obtain
C2
+ 5"
v(5) = cl
C2
= 110.
v(i 0) = Cl
+ 10
= O.
= 110. C2 = llOO. From the second equation. Cl = -c2f10 = -llO. Allswer:
+ llOOIr Y. Figure 48 shows that the potential is not a straight line. as it would be for a potential
By subtraction. c2flO
vCr) = -llO
between two parallel plates. For example, on the sphere ofradius 7.5 cm it is not 11012 = 55 Y, but considerably
less. (Whm is it?)
•
v
100
80
,
"'
60
40
20
0
5
6
Fig. 48.
11-101
8
111-151
GENERAL SOLUTION
Find a real general solution, showing the details of your
work.
2. 4x 2y" + 4xy' - y = 0
3. x 2 y" - 7xy' + 16y = 0
4. x 2 y" + 3x-,,' + y = 0 5. x 2 )'''
6. 2x 2 )"" + 4x)" + 5y = 0
7
10
9
r
Potential v(r) in Example 4
INITIAL VALUE PROBLEM
Solve and graph the solution, showing the details of your
work.
11. x 2y" - 4xy'
+ 6)' = 0, y(l) = I, y'(l) = 0
12. x .\''' + 3xy' + y = 0, y(\) = 4, y' (1) = -2
13. (x 2D2 + 2xD + 100.251)y = 0, y(1) = 2.
2
-
xy' + 2y
7. (lOx 2D2 - 20xD + 22.4l)y = 0
8. (4x 2D2 + l)y = 0
9. (100x 2D2
10. (I Ox 2D2 + 6xD + 0.5/)y = 0
=
0
y'(1)
+
9l)y
=
0
= -11
14. (x 2D2 - 2xD
y' (1) = 2.5
15. (xD 2 + 4D)y
+
2.251)y = 0, y(l) = 2.2,
= 0,
y(l)
= 12, y' (1) = -6
SEC. 2.6
16. TEAM PROJECT. Double Root
(A) Derive a second linearly independent solution of
by reduction of order; but instead of using (9), Sec.
2.1, perform all steps directly for the present ODE (I).
(B) Obtain x In In xby considering the solutionsx m and
x m + s of a suitable Euler-Cauchy equation and letting
(I)
s~O.
2.6
73
Existence and Uniqueness of Solutions. Wronskian
(C) Verify by substitution thatx m In x, 111 = (1 - a)/2,
is a solution in the critical case.
(D) TransfoI1n the Euler-Cauchy equation (1) into an
ODE with constant coefficients by setting x = et (x > 0).
(E) Obtain a second linearly independent solution of
the Euler-Cauchy equation in the "critical case" from
that of a constant-coefficient ODE.
Existence and Uniqueness of Solutions.
Wronskian
In this section we shall discuss the general theory of homogeneous linear ODEs
y" + p(x)y'
(1)
+ q(x)y = 0
with continuous, but otherwise arbitrary variable coefficients p and q. This will concern
the existence and form of a general solution of (1) as well as the uniqueness of the solution
of initial value problems consisting of such an ODE and two initial conditions
(2)
with given xo, Ko, and K 1 .
The two main results will be Theorem 1, stating that such an initial value problem
always has a solution which is unique, and Theorem 4, stating that a general solution
(3)
(Cl, Cz arbitrary)
includes all solutions. Hence linear ODEs with continuous coefficients have no "singular
solutions" (solutions not obtainable from a general solution).
Clearly, no such theory was needed for constant-coefficient or Euler-Cauchy equations
because everything resulted explicitly from our calculations.
Central to our present discussion is the following theorem.
THEOREM 1
Existence and Uniqueness Theorem for Initial Value Problems
If p(x) and q(x) are continuous functions on some open interval J (see Sec.
1.1) and
is in J, then the initial value problem consisting of (1) and (2) has a unique
solution y(x) on the interval 1.
Xo
The proof of existence uses the same prerequisites as the existence proof in Sec. 1.7
and will not be presented here; it can be found in Ref. [All] listed in App. 1. Uniqueness
proofs are usually simpler than existence proofs. But for Theorem 1, even the uniqueness
proof is long, and we give it as an additional proof in App. 4.
CHAP. 2
74
Second-Order Linear ODEs
Linear Independence of Solutions
Remember from Sec. 2.1 that a general solution on an open interval 1 is made up from a
basis ."1> Y2 on I, that is, from a pair of linearly independent solutions on I. Here we call
)'t, )'2 linearly independent on 1 if the equation
(4)
k) = 0,
implies
k2 = O.
We call y) • ."2 linearly dependent on 1 if this equation also holds for constants kh k2
not both O. In this case, and only in this case. YI and Y2 are proportional on I. that is (see
Sec. 2.1),
(5)
(a)
or
YI = /...)'2
(b)
)'2
=
[YI
for all x on I.
For our discussion the following criterion of linear independence and dependence of
solutions will be helpful.
THEOREM 2
Linear Dependence and Independence of Solutions
LeT the ODE (I) Izave cOllfinuous coefficients p(x) and q(x) all an open interval I.
Then two solutions YI and .1'2 of (1) on T are linearly dependent on I if and only if
their "Wronskian"
(6)
is 0 lit some Xo in 1. FlIrthermore, if W = 0 at an x = Xo in I. then W == 0 on I: hence
if there is all Xl ill 1 at wlzich W is IIOt 0, thell ."1. Y2 are linearly independent on I.
PROOF
(a) Let."1 and Y2 be linearly dependent on I. Then (5a) or (5b) holds on I. If (5a) holds, then
W(YI, .1'2)
= YI.\'~
- Y2Y~
= k)'2Y~
- Y2k.1'~
= O.
Similarly if (5b) holds.
(b) Conversely, we let W()'I' )'2) = 0 for some x = Xo and show thar this implies linear
dependence of YI and .\'2 on T. We consider the linear system of equations in the unknowns
k I , k2
(7)
kIYI(XO)
+
kl.)'~(xo)
+ k2)'~(XO)
k2 Y2(XO) = 0
=
O.
To eliminate k 2 • multiply the first equation by Y~ and the second by -Y2 and add the
resulting equations. This gives
Similarly, to eliminate kI' multiply the first equation by -Y~ and the second by YI and
add the resulting equations. This gives
SEC. 2.6
75
Existence and Uniqueness of Solutions. Wronskian
If W were not 0 at xo, we could divide by Wand conclude that kl = k2 = O. Since W is
0, division is not possible, and the system has a solution for which kl and k2 are not both
O. Using these numbers k1> k2 , we introduce the function
y(X) = k 1Yl(X)
+ k 2Y2(X),
Since (1) is homogeneous linear, Fundamental Theorem I in Sec. 2.1 (the superposition
principle) implies that this function is a solution of (I) on I. From (7) we see that it satisfies
the initial conditions )"(xo) = 0, y' (xo) = O. Now another solution of (I) satisfying the
same initial conditions is y* == O. Since the coefficients p and q of (I) are continuous.
Theorem I applies and gives uniqueness, that is, y == y*, written out
on 1.
Now since kl and k2 are not both zero, this means linear dependence of )'1> )"2 on l.
(e) We prove the last statement of the theorem. If W(xo ) = 0 at an Xo in I, we have
linear dependence of .h, Y2 on I by part (b), hence W == 0 by part (a) of this proof. Hence
in the case of linear dependence it cannot happen that W(x l ) =f. 0 at an XI in 1. If it does
happen, it thus implies linear independence as claimed.
•
Remark.
Determinants. Students familiar with second-order determinants may have
noticed that
Y~I
Y2
I
= YIY2
I
- Y2Yl'
This determinant is called the Wronski deTennina1lt 5 or, briefly, the Wronskian, of two
solutions)"1 and)"2 of U), a<; has already been mentioned in (6). Note that its four entries
occupy the same positions as in the linear system 0).
E X AMP L E 1
Illustration of Theorem 2
The functions)"1 = cos wx and
W(cos wx. sin wx) =
)"2 =
sin wx are solutions of)"" + w2 y = O. Their Wronskian is
cos.wx
I
-WSlfl wX
sin wx
wcO~
I=
YIY~ - Y2Y~
= w cos
2
lUX
+
w sin
2
wx = w.
wX
'*
Theorem 2 shows that the,e solutions are linearly independent if and only if w
O. Of course, we can see
this directly from the quotient \'2IYl = tan wx. For w = 0 we have .\"2 == 0, which implies linear dependence
(why?).
•
E X AMP L E 2
Illustration of Theorem 2 for a Double Root
A general solution of y" - 2y' + Y = 0 on any interval is y = lCI + C2X)ex. (VeIify!). The corresponding
Wronskian is not O. which shows linear independence of eX and xi'" on any interval. Namely.
•
5Introduced by WRONSKI (JOSEF MARIA HONE. 1776-1853). Polish mathematician.
CHAP. 2
76
Second-Order Linear ODEs
A General Solution of (1) Includes All Solutions
This will be our second main result, as announced at the beginning. Let us start with existence.
Existence of a General Solution
THEOREM 3
/fp(x) and q(x) are continuous on an open interval I, then (1) has a general solution
on T.
PROOF
By Theorem 1, the ODE (1) has a solution heX) on T satisfying the initial conditions
and a solution Y2(X) on T satisfying the initial conditions
The Wronskian of these two solutions has at x
=
Xo
the value
Hence, by Theorem 2. these solutions are linearly independent on l. They fonn a basis of
solutions of (1) on T, and y = ("1)'1 + C2Y2 with arbitrary c 1 - C2 is a general solution of (1)
on T, whose existence we wanted to prove.
•
We finally show that a general solution is as general as it can possibly be.
A General Solution Includes All Solutions
THEOREM 4
/f the ODE (1) has cnntinuuus cnefficients p(x) and q(x) on some open interval I,
then every solution Y = Y(x) of (1) on 1 is of the form
(8)
where Yv Y2 is any basis of solutions of (l) on 1 and C v C2 are suitable constants.
Hence (1) does not have singular solutions (that is, solutions not obtainablefrom
a general solution).
PROOF
Let y = Y(x) be any solution of (1) on I. Now, by Theorem 3 the ODE (I) has a general
solution
(9)
on 1. We have to find suitable values of C1> C2 such that y(x) = Y(x) on I. We choose any
in 1 and show first that we can find values of Cl' C2 such that we reach agreement at
xo, that is, y(xo) = Y(xo) and y' (xo) = Y' (xo). Written out in terms of (9), this becomes
Xo
(10)
(a)
Clh(Xo)
+
C2.'"2(XO)
= Y(xo)
(b)
CIY~(XO)
+
C2Y~(XO)
= Y' (xo).
SEC. 2.6
Existence and Uniqueness of Solutions. Wronskian
77
We detennine the unknowns Cl and C2' To eliminate C2, we multiply (lOa) by J~(xo) and
(lOb) by -Y2(XO) and add the resulting equations. This gives an equation for Cl' Then we
multiply (lOa) by -J~(xo) and (lOb) by Yl(XO) and add the resulting equations. This gives
an equation for C2' These new equations are as follows, where we take the values of Jb
y~. )'2' Y~' Y. y' at Xo,
Cl()'IY~ - yzY~) = Cl W(y!> .\'2) = yy~ - Y2 Y '
C2(YIJ~ - J2Y~) = C2 W(Yb )'2) = h Y' - Yy~.
Since)'b Yz is a basis, the Wronskian W in these equations is not 0, and we can solve for
C2' We call the (unique) solution Cl = C b C2 = C2 . By substituting it into (9) we
obtain from (9) the particular solution
Cl and
Now since C b C2 is a solution of (10), we see from (10) that
From the uniqueness stated in Theorem
this implies that y* and Y must be equal
everywhere on /, and the proof is complete.
•
Looking back at he content of this section, we see that homogeneous linear ODEs with
continuous variable coefficients have a conceptually and structurally rather transparent
existence and uniqueness theory of solutions. Important in itself, this theory will also
provide the foundation of an investigation of nonhomogeneous linear ODEs, whose theory
and engineering applicatiuns we shall study in the remaining four sections of this chapter.
-_ _-..
.-i. __ ._.__
..
..
11-171
~
--
..... ..-.
~--
.......
BASES OF SOLUTIONS.
CORRESPONDING ODEs. WRONSKIANS
Find an ODE (1) for which the given functions are
solutions. Show linear independence (a) by considering
quotients, (b) by Theorem 2.
1. eO. 5x , e-O. 5x
2. cos 7rX, sin 7rX
3. e kx , xe kx
5. XO. 25 , xO. 25 In
4. x 3 , x- 2
6. e 3 .4x , e- 2 . 5X
x
7. cos (2 In X), sin (2 In
8. e- 2x , xe- 2x
10.
x- 3 • x- 3
In
X)
11. cosh 2.5x, sinh 2.5x
x
12. e- 2x cos wx, e- 2x sin wx
13. e- x cos 0.8x, e- X sin 0.8x
14.
X-I
cos (In x),
X-I
sin (In x)
15. e- 2 . 5x cos 0.3x. e- 2 . 5x sin 0.3x
16. e- kx cos 7rX, e- kx sin
17. e- 3 . 8 "ITx, xe- 3 .8 "ITx
7rX
18. TEAM PROJECT. Consequences of the Present
Theory. This concems some noteworthy general
properties of solutions. Assume that the coefficients p
and q of the ODE (l) are continuous on some open
interval T. to which the subsequent statements refer.
(A) Solve y" - Y = 0 (a) by exponential functions,
(b) by hyperbolic functions. How are the constants in
the corresponding general solutions related?
(8) Prove that the solutions of a basis cannot be 0 at
the same point.
(C) Prove that the solutions of a basis cannot have a
maximum or minimum at the same point.
(D) Express (Y2/Yl) , by a fOimula involving the
Wronskian W. Why is it likely that such a formula
should exist? Use it to find Win Prob. 10.
(E) Sketch YI(X) = x 3 if X ~ 0 and 0 if x < 0,
Y2(X) = 0 if x ~ 0 and x 3 if x < O. Show linear
independence on - I < x < 1. What is their
Wronskian? What Euler-Cauchy equation do Y10 Y2
satisfy? Is there a contradiction to Theorem 2?
CHAP. 2
78
Second-Order Linear ODEs
(F) Prove Abel's formula 6
where c = W(Yl (xo), Y2(xo». Apply it to Prob. 12. Him:
Write (1) for Y1 and for )'2' Eliminate q algebraically
from these two ODEs. obtaining a first-order linear
ODE. Solve it.
W(.vrlX), Y2lx)) = c exp [ - fXp(t) dt]
Xo
2.7
Nonhomogeneous ODEs
Method of Undetermined Coefficients
In this section we proceed from homogeneous to nonhomogeneous linear ODEs
y"
(1)
+ p(x)y' + q(x)y =
rex)
where rex) =t= O. We shall see that a "general solution" of (1) is the sum of a general
solution of the corresponding homogeneous ODE
(2)
y"
+ p(x)y' + q(x)y
= 0
and a "particular solution" of 0). These two new terms "general solution of (\)" and
"particular solution of 0)" are defined as follows.
DEFINITION
General Solution, Particular Solution
A general solution of the nonhomogeneous ODE (I) on an open interval I is a
solution of the form
(3)
here. Yh = ClYl + C2Y2 is a general solution of the homogeneous ODE (2) on I and
Yp is any solution of ( 1) on I containing no arbitrary constants.
A particular solution of (I) on I is a solution obtained from (3) by assigning
specific values to the arbitrary constants Cl and C2 in .rh'
Our task is now twofold, first to justify these definitions and then to develop a method
for finding a solution yp of (I).
Accordingly, we first show that a general solution as just defined satisfies (I) and that
the solutions of 0) and (2) are related in a very simple way.
THEOREM 1
Relations of Solutions of (1) to Those of (2)
(a) The sum of a solution y of (I) 0/1 some open inten>al I alld a solution yof
(2) on I is a solution of (1) 0/1 l. In particular. (3) is a solution of (1) on l.
(b) The differellce oftH'o solutions of (1) on I is a solution of(2) on I
6 NIELS
HENRIK ABEL (1802-1829). Norwegian mathematician.
SEC 2.7
79
Nonhomogeneous ODEs
PROOF
(a) Let L[y] denote the left side of (1). Then for any solutions y of (1) and yof (2) on I,
L[y
+ y] =
L[y]
+ L[y] = r + 0 =
(b) For any solutions \" and y';' of (I) on I we have L[y - y*]
=
r.
L[ \"] - L[y*] =
r - r = O.
•
Now for homogeneous ODEs (2) we know that general solutions include all solutions.
We show that the same is true for nonhomogeneous ODEs (1).
THEOREM 2
A General Solution of a Nonhomogeneous ODE Includes All Solutions
If the
coefficients p(x), q(x), and the function r(x) in (1) are continuous on some
open interval I, then ever), solution of (I) on T is obtained by assigning suitable
values to the arbitrary constants CI and C2 in a general solution (3) of (I) on I.
PROOF
Let y* be any solution of (\) on T and Xo any x in I. Let (3) be any general solution of (1)
on T. This solution exists. Indeed, Yh = CIYI + C2Y2 exists by Theorem 3 in Sec. 2.6
because of the continuity assumption, and Yp exists according to a construction to be shown
in Sec. 2.10. Now, by Theorem I (b) just proved, the difference Y = y* - Yp is a solution
of (2) on I. At Xo we have
Theorem I in Sec. 2.6 implies that for these conditions, as for any other initial conditions
in I, there exists a unique particular solution of (2) obtained by assigning suitable values
to c l , C2 in Yh. From this and y* = Y + YP the statement follows.
•
Method of Undetermined Coefficients
Our discussion suggests the following. To solve the nonhomogeneous ODE (I) or an initial
value problel1lfor (1), we have to solve the homogeneolls ODE (2) and find any solution
yp of (1), so that we obtain a general solution (3) of (1).
How can we fmd a solution Yp of (1)? One method is the so-called method of
undetermined coefficients. It is much simpler than another, more general method (to be
discussed in Sec. 2.10). Since it applies to models of vibrational systems and electric
circuits to be shown in the next two sections, it is frequently used in engineering.
More precisely, the method of undetermined coefficients is suitable for linear ODEs
with constant coefficients a and b
(4)
y"
+ aJ' +
by
=
rex)
when rex) is an exponential function, a power of x, a cosine or sine, or sums or products
of such functions. These functions have derivatives similar to rex) itself. This gives the
idea. We choose a form for Yp similar to rex). but with unknown coefficients to be
determined by substituting that yp and its derivatives into the ODE. Table 2.1 on p. 80
shows the choice of Yp for practically important fonns of rex). Corresponding ruIes are
as follows.
CHAP. 2
80
Second-Order Linear ODEs
Choice Rules for the Method of Undetermined Coefficients
(a) Basic Rule. If rex) in (4) is one of the fUllctions in the first colU111n in
Table 2.1. choose )'p ill the same line and determine its undetermined
coefficients hy suhstituting .\"p and its deril'atil'es i1lfo (4).
(b) Modification Rule. rl a tenn in your chnice for .\"p happens to be a
solution of the homogeneous ODE corresponding to (4). lIlultiply your
choice of.\'1' by x (or by x 2 !l this solution c()rre~pollds to a double root of
the characteristic equation of the h01l1ogeneous ODE).
(C)
Sum Rule. rl rex) is a sum of functiolls ill the first column of Table 2.1.
choose for .\"p the sum of the functions ill the corresponding lines of the
second COllllll1l.
The Basic Rule applies when rex) is a single tenn. The Modification Rule helps in the
indicated case, and to recognize such a case. we have to solve the homogeneous ODE
first. The Sum Rule follows by noting that the sum of two solutions of (I) with r = rl
and r = r2 (and the same left side!) is a solution of (1) with r = 1"1 + r2' (Verify!)
The method is self-correcting. A false choice for .\"p or one with too few tenns will lead
to a contradiction. A choice with too many terms will give a correct result. with superfluous
coefficients coming out zero.
Let us illustrate Rules (a)-( c) by the typical Examples 1-3.
Table 2.1
Method of Undetermined Coefficients
Term in rex")
E X AMP L E 1
Choice for .\'p(.t)
key:r
Cey:r
kt" (n = O. L· .. )
kcos wx
k sin wx
ke"'" cos wx
ke,,:r sin wx
Knxn
+ Kn_1xn - 1 +
} K cos wx
+ K1x + Ko
+ M sin wx
} e"X(K cos wx + M sin wx)
Application of the Basic Rule (a)
Sol"e the initial value problem
.1'(0) = O.
(5)
Solutioll.
y' (0)
= 1.5 .
Step 1. Gelleral solutioll of the homogelleolls ODE. The ODEy" + Y =
y" = A cos
~
a has the general solution
+ B sin X.
v;
Step 2. SoluRon yp Of the nonhomogelleous ODE. We first try .1'1' = Kx 2 . Then
= 2K. By substitution.
2K + K\-2 = 0.00Ix 2. For this to hold for all X. the coefficient of each power of x (x 2 and ,0) must be the same
on both sides; thus K = 0.001 amI 2K = O. a contradiction.
The second line in Table 2.1 suggests (he choice
Then
2
Equating the coefficients of x . X, .\ 0 on both sides. we have K2 = 0.001. K]
Ko = -2K2 = -0.002. Tills givesyp = 0.001x 2 - 0.002, and
y =
-,"/z
+ -'"1' = A cos x + B sin x + O.OOh 2
-
=
0.002.
0, 2K2
+
Ko
= O.
Hence
SEC. 2.7
81
Nonhomogeneous ODEs
Step 3. Solutioll of the il/itial value problem. Setting x = 0 and using the first initial condition gives
yeO) = A - 0.002 = O. hence A = 0.002. By differentiation and from the second initial condition,
.1"
=
y;, + Y~ =
-A sin x + B cos x + 0.002x
and
/(0) = B
=
1.5.
This gives the answer (Fig. 49)
y = 0.002 cos x
+
1.5 sinx
+
0.001x2
-
0.002.
Figure 49 shows y as well as the quadratic parabola)')) about which y is oscillating, practically like a sine curve
•
since the cosine term is smaller by a factor of about 111000.
x
Fig. 49.
E X AMP L E 2
Solution in Example 1
Application of the Modification Rule (b)
Solve the initial value problem
(6)
y" + 3/ + 2.25y
=
-10
e-1.
5x,
y'(O) = O.
yW) = I,
Solution. Step
1. Gel/eral solutioll of the homogelleous ODE. The characteristic equation of the
homogeneous ODE is A2 + 3A + 2.25 = (A + 1.5)2 = O. Hence the homogeneou~ ODE has the general
solution
Step 2. Solutio" Yp of the "ollhomogelleous ODE. The function e-1.5x on the light would normally require
the choice Ce-1. 5x. But we see from .I'h that this function is a solution of the homogeneous ODE. which
corresponds to a double root of the characteristic equation. Hence, according to the Modification Rule we have
to multIply our choice function by x 2 . That is, we choo~
Then
We substitute these expressions into the given ODE and omit the factor e- 1 .5x . This yields
C(2 - 6x + 2.25x 2 ) + 3C(2x - 1.5x 2 ) + 2.25Cx 2
=
-10.
Comparing the coefficients of x 2 • x. xU gives 0 = 0.0 = O. 2C = -10. hence C = -5. This gives the solution
Yp = _5x 2 e- 1 .5x . Hence the given ODE has the general solution
Step 3. Solutioll of the initial value problem. Setting x = 0 in y and using the first initial condition. we obtain
.1'(0) = CI = I. Differentiation of y gives
From this and the second initial condition we have
gives the answer (Fig. 50)
y' (0) =
c2 -
1.5cl
=
O. Hence c2
=
1.5cl
=
1.5. This
The curve begins with a horizontal tangent. crosses the x-axis at x = 0.6217 (where 1 + 1.5x - 5x 2 = 0) and
•
approaches the axis ti'om below as x increases.
CHAP. 2
82
Second-Order Linear ODEs
y
1.0
0.5
Oc---';,--'------''------''------'--==-'-x
~l
2
~---5
-D.5
~
-1.0
Fig. 50.
E X AMP L E 3
Solution in Example 2
Application of the Sum Rule (c)
Solve the initial value problem
(7)
y"
Solution.
+ 2y' + 5)"
=
(,0.5"
yeo)
40 cos lOx - 190 sin lOx.
T
=
0.16.
y' (0)
=
40.08.
Step 1. General solutioll of the homogeneous ODE. The characteristic equation
A2
+
2A
+
5 = (A
+
1
+
2i){A
+
1 - 20 = 0
shows that a real general solution of the homogeneous ODE is
J'h = e -x (A cos 2,
+B
sin 2x).
Step 2. Solution of the Ilonhomogeneous ODE. We write Yp = )"1'1 +
exponential term and .\1'2 to the sum of the other twO terms. We set
Then
.1"1'2,
where J'pl corresponds to the
and
Substitution into the given ODE and omission of the exponemial factor gives (0.25 + 2,0.5
C = 116.25 = 0.16. and )"1'1 = 0.16eo. 5".
We now set )"1'2 = K cos lOx + M sin lOx. as in Table 2.1. and obtain
)'~2
=
-10K sin lOx
+ 10M cos
lOx.
\";2 =
+ 5)C =
1, hence
-lOOK cos lOx - 100M sin lOx.
Substitution into the given ODE gives for the cosine terms and for the sine tenTIS
- lOOK + 2· 10M + 5K = 40,
-100M - 2' 10K + 5M = -190
or, by simplification.
-95K
The solution is K
+
20M = 40,
-10K - 95M = -190.
= O. M = 2. Hence .1'1'2 = 2 ,in lOx. Together,
Y = Yh + Ypl +
.1'1'2 =
e- x (A co~ 2x + B
SIl1
2<)
+ 0.16,,0.5x + 2 sin lOx.
Step 3. Sollllion of the initial value problem. From y and the first initial condition. y{O) = A
hence A = O. Differentiation gives
y' =
e -xC -A cos 2x - B sin 2, - 2A sin 2x
+ 0.16
=
0.16,
+ 2B cos 2,) + 0.08eO. 5 :< + 20 cos lOx.
From this and the second initial condition we have /(0) = -A + 2B + 0.08 + 20 = 40.08, hence B
This gives the solution (Fig. 51)
Y = lOe- x sin 2x + 0.16,,°·5.< + 2 sin lOx.
=
10.
The firsllerrn goes to 0 relatively fas!. When x = 4. it is practically O. as the dashed curves::': lOe -x + 0.16eo. 5r
show. From then on, the last term, 2 sin lOx, gives an oscillation about 0.16eo. 5 ,", the monotone increasing
dashed curve.
•
SEC 2.7
83
Nonhomogeneous ODEs
y
10
,,
-4
Fig. 51.
Solution in Example 3
Stability. The following is important. If (and only if) all the roots of the characteristic
equation of the homogeneous ODE y" + ay' + by = 0 in (4) are negative, or have a negative
real part, then a general solution.Vl, of this ODE goes to 0 as x ~ (Xl, so that the "transient
solution" Y = Yh + yp of (4) approaches the "steady-state solution" yp' [n this case the
nonhomogeneous ODE and the physical or other system modeled by the ODE are called
stable; otherwise they are called unstable. For instance, the ODE in Example 1 is unstable.
Basic applications follow in the next two sections.
[1-141
GENERAL SOLUTIONS OF
NONHOMOGENEOUS ODEs
Find a (real) general solution. Which rule are you using?
(Show each step of your calculation.)
1.
2.
3.
4.
5.
6.
7.
8.
+ 3/ +
+ 4/ +
2y = 30e 2x
3.75y = 109 cos 5x
y" - 16y = 19.2e 4 ,' + 60e x
-,,"
y"
+ 9y = cos x + 4cos 3x
+ y' - 6)' = 6x 3 - 3x 2 + 12x
y" + 4y' + 4)' = e- 2x sin 2x
y" + 6/ + 73y = 80e x cos 4x
y" + lOy' + 25y = 100 sinh 5x
y"
)'''
9. y" - 0.16y = 32 cosh O.4x
10. y"
11. y"
+
+
+
4/ + 6.25y
= 3.125(x
1.44y = 24 cos 1.2x
+
1)2
12. y"
9y = 18x + 36 sin 3x
13. y" + 4v' + 5)' = 25x 2 + 13 sin 2x
14. y" + 2y' + Y = 2x sin x
115-201
INITIAL VALUE PROBLEMS FOR
NONHOMOGENEOUS ODEs
Solve the initial value problem. State which mles you are
using. Show each step of your calculation in detail.
15. y" + 4y = 16 cos 2x, y(O) = 0, y' (0) = 0
16. y" - 3)" + 2.25)' = 27(x 2 - x).
yeO) = 20,
y' (0) = 30
17. y" + 0.2y' + 0.26)' = 1.22eo. 5x ,
y(O) = 3.5.
y' (0) = 0.35
18. y" - 2/ = 12e 2x - 8e- 2x ,
yeO) = -2,
/(0) = 12
19. y" - v' - 12\" = 144x 3 + 12.5,
~·(O) ~ 5,
. y' (0) = -0.5
20. y"
+ 2y' +
y(O) = 6.6,
lOy
=
17 sin x - 37 sin 3x,
y' (0) = -2.2
21. WRITING PROJECT. Initial Value Problem. Write
out all [he details of Example 3 in your own words.
Discuss Fig. 51 in more detail. Why is it that some of
the "half-waves" do not reach the dashed curves.
whereas others preceding them (and, of course, all later
ones) excede the dashed curves?
22. TEAM PROJECT. Extensions of the Method of
Undetermined Coefficients. (a) Extend the method
to products of the function in Table 2.1. (b) Extend
the method to Euler-Cauchy equations. Comment on
the practical significance of such extensions.
23. CAS PROJECT. Structure of Solutions of Initial
Value Problems. Using the present method. fmd, graph,
and discuss the solutions y of initial value problems of
your own choice. Explore effects on solutions caused by
84
CHAP. 2
Second-Order Linear ODEs
problem with .1'(0) = 0, y' (0) = O. Consider a problem
in which you need the Modification Rule (a) for a simple
root, (b) for a double root. Make sure that your problems
cover all three Cases I. II. III (see Sec. 2.2).
changes of initial conditions. Graph yp' y, .I' - Yp
separately, to see the separate effects. Find a problem in
which (a) the pan of y resulting from Yh decreases to zero,
(b) increases. (c) is not present in the answer y. Study a
2.8
Modeling: Forced Oscillations. Resonance
In Sec. 2.4 we considered vertical motions of a mass-spring system (vibration of a mass
on an elastic spring, as in Figs. 32 and 52) and modeled it by the homogeneolls linear
ODE
111
my" + cy'
(1)
+ ky = O.
Here yet) as a function of time t is the displacement of the body of mass 111 from rest.
These were free motions, that is, motions in the absence of extemalforces (outside forces)
caused solely by internal forces. forces within the system. These are the force of inertia
my", the damping force c/ (if c > 0). and the spring force ky acting as a restoring force.
We now extend our model by including an external force, call it ret), on the right. Then
we have
my"
(2*)
+ cy' + ky =
ret).
Mechanically this means that at each instant t the resultant of the internal forces is in
equilibrium with r(t). The resulting motion is called a forced motion with forcing
function ret), which is also known as input or driving force, and the solution yet) to be
obtained is called the output or the response of the system to the driving force.
Of special interest are periodic external forces. and we shall consider a driving force
of the form
ret)
=
Fo cos wt
(Fo
> 0,
w
> 0).
Then we have the nonhomogeneous ODE
(2)
my"
+ cy' +
ky
=
Fo cos wt.
Its solution will familiarize us with further interesting facts fundamental in engineering
mathematics, in particular with resonance.
c
Dashpot
Fig. 52.
Mass on a spring
SEC. 2.8
85
Modeling: Forced Oscillations. Resonance
Solving the Nonhomogeneous ODE (2)
From Sec. 2.7 we know that a general solution of (2) is the sum of a general solution Yh
of the homogeneous ODE (1) plus any solution yp of (2). To find yp' we use the method
of undetermined coefficients (Sec. 2.7), starting from
(3)
yp(t)
= a cos wI + b sin wI.
By differentiating this function (chain rule!) we obtain
, = -wa Sill
. wt
Yp
+
y; = -w2a cos wt -
b
w cos wt.
w2b sin wt.
Substituting Yp' y~, and y; into (2) and collecting the cosine and the sine terms, we get
[(k - 11lw2)a
+
web] cos wt
+ [ -wca
+ (k - 11lw2 )b]
sin wt
=
Fo cos wt.
The cosine terms on both sides must be equaL and the coefficient of the sine term on the
left must be zero since there is no sine term on the right. This gives the two equations
= Fo
web
(4)
-well
for determining the unknown coefficients a and b. This is a linear system. We can solve
it by elimination. To eliminate b, multiply the first equation by k - 11lW2 and the second
by - we and add the results, obtaining
Similarly. to eliminate a. multiply the first equation by we and the second by k and add to get
If the factor (k - 11lW2)2
and b,
If we set ~
=
Wo
+
w2e 2 is not zero, we can divide by this factor and solve for a
(> 0) as in Sec. 2.4, then k =
III Wo
2
and we obtain
(5)
We thus obtain the general solution of the nonhomogeneous ODE (2) in the fonn
(6)
2
11lW
yet)
=
y,,(1)
+ Yp(t).
86
CHAP. 2
Second-Order Linear ODEs
Here Yh is a general solution of the homogeneous ODE (1) and Yp is given by (3) with
coefficients (5).
We shall now discuss the behavior of the mechanical system, distinguishing between
the two cases c = 0 (no damping) and c > 0 (damping). These cases will correspond to
two basically different types of output.
Case 1. Undamped Forced Oscillations. Resonance
If the damping of the physical system is so small that its effect can be neglected over the
time interval considered, we can set c = O. Then (5) reduces to a = F o/[m(wo 2 - w 2 )]
and b = O. Hence (3) becomes (use wo 2 = kim)
(7)
*"
Here we must assume that w2 wo2; physically, the frequency wl(27T) [cycles/sec] of the
driving force is different from the natural frequency w o/(27T) of the system, which is the
frequency of the free undamped motion [see (4) in Sec. 2.4]. From (7) and from (4*) in
Sec. 2.4 we have the general solution of the "undamped system"
(8)
We see that this output is a superpositioll of two harmollic oscillations of the frequencies
just mentioned.
Resonance.
cos wt = I)
We discuss (7). We see that the maximum amplitude of Yp is (put
(9)
I
p= ------::-
where
I - (wlwO)2
a o depends on w and woo If w ~ wo, then p and a o tend to infinity. This excitation of
large oscillations by matching input and natural frequencies (w = w o) is called
resonance. p is called the resonance factor (Fig. 53), and from (9) we see that plk = aolFo
is the ratio of the amplitudes of the particular solution Yp and of the input Fo cos wt.
We shall see later in this section that resonance is of basic importance in the study of
vibrating systems.
In the case of resonance the nonhomogeneous ODE (2) becomes
(10)
Then (7) is no longer valid. and from the Modification Rule in Sec. 2.7 we conclude that
a particular solution of (10) is of the form
Yp(t)
=
tea cos wot
+
b sin Wol).
SEC. 2.8
Modeling: Forced Oscillations. Resonance
87
p
co
Fig. 53.
Resonance factor p(co)
By substituting this into (10) we find a = 0 and b = Fo/(2I1lwo). Hence (Fig. 54)
(11)
yp(t)
Fo
2mwo
.
= - - - t sm
wot.
We see that because of the factor t the amplitude of the vibration becomes larger and
larger. Practically speaking, systems with very little damping may undergo large vibrations
that can destroy the system. We shall return to this practical aspect of resonance later in
this section.
Fig. 54.
Particular solution in the case of resonance
Beats. Another interesting and highly important type of oscillation is obtained if w is
close to woo Take, for example, the particular solution [see (8)]
(12)
yet)
=
Fo
2
2 (cos wt - cos wot)
m(wo - w)
Using (12) in App. 3.1, we may write this as
yet)
=
2Fo
2
lIl(wo -
. (wo
sm
2
W
)
+
2
w)
t
sin ( Wo 2- w t) .
Since w is close to wo, the difference Wo - w is small. Hence the period of the last sine
function is large, and we obtain an oscillation of the type shown in Fig. 55, the dashed
curve resulting from the first sine factor. This is what musicians are listening to when
they tune their instruments.
CHAP. 2
88
Second-Order Linear ODEs
y
Fig. 55. Forced undamped oscillation when the difference
of the input and natural frequencies is small ("beats")
Case 2. Damped Forced Oscillations
If the damping of the mass-spring system is not negligibly small, we have e > 0 and a
damping term cy' in (1) and (2). Then the general solution y" of the homogeneous ODE
(I) approaches zero as t goes to infinity, as we know from Sec. 2.4. Practically, it is zero
after a sufficiently long time. Hence the "transient solution" (6) of (2), given by
Y = Yh + Yp' approaches the "steady-state solution" yp' This proves the following.
THEOREM 1
Steady-State Solution
After a sufficiently long time the output of a damped vibrating system under a purely
sinusoidal dril'ing force [see (2)1 will practically be a harmonic oscillation whose
.!i"eqllency is that of the input.
Amplitude of the Steady-State Solution. Practical Resonance
Whereas in the undamped case the amplitude of yp approaches infinity as w approaches
woo this will not happen in the damped case. In this case the amplitude will always be finite.
But it may have a maximum for some w depending on the damping constant c. This may
be called practical resonance. It is of great importance because if c is not too large, then
some input may excite oscillations large enough to damage or even destroy the system.
Such cases happened. in particular in earlier times when less was known about resonance.
Machines, cars, ships, airplanes, bridges, and high-rising buildings are vibrating mechanical
systems. and it is sometimes rather difficult to find constructions that are completely free
of undesired resonance effects, caused, for instance, by an engine or by strong winds.
To study the amplitude of yp as a function of w, we write (3) in the form
(13)
yp(t)
= C* cos (wt -
1]).
C* is called the amplitude of yp and 1] the phase angIe or phase lag because it measures
the lag of the output behind the input. According to (5). these quantities are
(14)
tan 1](w)
=
b
a
we
SEC. 2.8
89
Modeling: Forced Oscillations. Resonance
Let us see whether C*(w) has a maximum and, if so. find its location and then its size.
We denote the radicand in the second root in C* by R. Equating the derivative of C* to
zero, we obtain
The expression in the brackets [... J is zero if
(15)
By reshuffling terms we have
The right side of this equation becomes negative if c 2 > 2mk, so that then (15) has no
real solution and C* decreases monotone as w increases, as the lowest curve in Fig. 56
on p. 90 shows. If c is smaller, c 2 < 2mk, then (15) has a real solution w = W max , where
(15*)
From (15*) we see that this solution increases as c decreases and approaches Wo
as c approaches zero. See also Fig. 56.
The size of C*(wrnax ) is obtained from (14), with w2 = w~ax given by (15*). For this
2
w we obtain in the second radicand in (14) from (15*)
(
and
Wo
2
-
2) c.
C
--2
2
2m
The sum of the right sides of these two formulas is
Substitution into (14) gives
(16)
We see that C*(wrnax ) is always finite when c
> O. Furthermore, since the expression
in the denominator of (16) decreases monotone to zero as c 2 « 2mk) goes to zero, the
maximum amplitude (16) increases monotone to infinity, in agreement with our result in
Case 1. Figure 56 shows the amplification C*IFo (ratio of the amplitudes of output and
input) as a function of W for m = 1, k = I, hence Wo = 1, and various values of the
damping constant c.
CHAP. 2
90
Second-Order Linear ODEs
Figure 57 shows the phase angle (the lag of the outpllt behind the input), which is less
than 7r/2 when w < wo, and greater than 7r/2 for w > woo
C'
Po
'1
TC
4
3
,----c=o
c = 112
c=l
c=2
2
c= 2
OO---~--~--~--~--
o
Fir- 56. Amplification C*/Fo as a function
of w for m = 1, k = 1, and various values
of the damping constant c
11-81 STEADY-STATE SOLUTIONS
Find the steady-state oscillation of the mass-spring system
modeled by the given ODE. Show the details of your
calculations.
1. / ' + 6/ + 8)" = 130 cos 3t
2.4.'"" + 8y' + 13." = 8 sin 1.5t
3. y" + y' + 4.25y = 221 cos 4.5t
4. y" + 4/ + 5y = cos t - sin t
5. (D2 + 2D + I)y = -sin 2t
6. (D 2 + 4D + 31)y = cos r + l cos 3r
7. (D 2 + 6D + 181»), = cos 3t - 3 sin 3t
8. (D 2
+
2D
+
lOl)y = -25 sin 4t
19-141 TRANSIENT SOLUTIONS
Find the transient motion of the mass-spring system
modeled by the given ODE. (Show the details of your
work.)
y" + 2y' + 0.75.'" = 13 sin t
+ 4/ + 4)' = cos 4t
11. 4)''' + 12)"' + 9.,' = 75 sin 3t
12. (D2 + 5D + 41)), = sin 2t
9.
10. y"
13. (D 2
14. (D 2
+ 3D + 3.251))' = 13 - 39 cos 2t
+ 2D + 5l)y = I + sin t
115-201 INITIAL VALUE PROBLEMS
Find the motion of the mass-spring system modeled by
the ODE and initial conditions. Sketch or graph the
sol urian curve. In addition, sketch or graph the curve of
w
2
Fig. 57. Phase lag 1) as a function of w for
m = 1, k = I, thus Wo = 1, and various
values of the damping constant c
y - Yp to see when the system practically reaches the
steady state.
15. y" -t 2.'" + 26y = 13 cos 3t.
),(0) = 1.
y' (0) = 0.4
16. y" + 64y = cos t.
y(O) = O.
/(0) = 1
y(Q) = 0.7,
17. y" + 6y' + 8." = 4 sin 2t.
y' (0) = - 11.8
+ 2D + I)y = 75(sin t - ~ sin 21 + l sin 3t),
= O.
y'(O) = I
(4D 2 + 12D + 13l)r = 12 cos t - 6 sin t,
yeO) = 1.
y' (0) = -]
y" + 25.\' = 99 cos 4.9t, .1'(0) = 2, y' (0) = 0
18. (D2
y(O)
19.
20.
21. (Beats) Derive the formula after (12) from (12). Can
there be beats if the system has damping?
22. (Beats) How does the graph ofthe solution in Prob. 20
change if you change (a) yeO). (b) the frequency of the
driving force?
23. WRITING PROJECT. Free and Forced Vibrations.
Write a condensed report of 2-3 pages on the most
important facts about free and forced vibrations.
24. CAS EXPERIMENT. Undamped Vibrations.
(a) Solve the initial value problem y" + Y = cos wt,
w 2 *- I, yeO) = o. y' (0) = O. Show that the solution
can be written
y(t) =
(17)
- -2
-2
I-w
sin [ -1 (1
2
+
sin
w)t ] X
[~(l
- W)t].
SEC. 2.9
91
Modeling: Electric Circuits
(b) Experiment with (17) by changing w to see the
change of the curves from those for small w (> 0) to
beats, to resonance and to large values of w (see Fig. 58).
m=0.2
20n
25. TEAM PROJECT. Practical Resonance. (a) Give
a detailed derivation of the crucial formula (16).
(b) By considering dC*/dc show that C*( w max)
increases as c (~ V2111k) decreases.
(c) lIIustrate practical resonance with an ODE of your
own in which you vary c. and sketch or graph
corresponding curves as in Fig. 56.
(d) Take your ODE with c fixed and an input of two
terms, one with frequency close to the practical
resonance frequency and the other not. Discuss and
sketch or graph the output.
(e) Give other applications (not in the book) in which
resonance is important.
26. (Gun barrel) Solve
y" + Y
ifO~t~7T
I
=
{
o
ift>7T,
.1'(0) =
y' (0)
= O.
-10
m= 0.9
I
0.04
I~
I~
~11~l
'IOn
I
-0.04
This models an undamped system on which a force F
acts during some interval of time (see Fig. 59), for
instance, the force on a gun banel when a shell is fired,
the barrel being braked by heavy springs (and then
damped by a dashpot, which we disregard for
simplicity). Hint. At 7T both y and yf must be continuous.
F
m=1
~
k=1
~
n
m=6
Fig. 58.
Typical solution curves in CAS Experiment 24
Fig. 59.
Problem 26
2.9 Modeling: Electric Circuits
Designing good models is a task the computer cannot do. Hence setting up models has
become an important task in modern applied mathematics. The best way to gain experience
is to consider models from various fields. Accordingly, modeling electric circuits to be
discussed will be profitable for all students, not just for electrical engineers and computer
scientists.
We have just seen that linear ODEs have important applications in mechanics (see also
Sec. 2.4). Similarly, they are models of electric circuits, as they occur as portions of large
networks in computers and elsewhere. The circuits we shall consider here are basic
building blocks of such networks. They contain three kinds of components, namely,
resistors, inductors, and capacitors. Figure 60 on p. 92 shows such an RLC-circuit, as
they are called. In it a resistor of resistance R n (ohms), an inductor of inductance L H
(henrys), and a capacitor of capacitance C F (farads) are wired in series as shown, and
connected to an electromotive force E(t) V (volts) (a generator, for instance), sinusoidal
as in Fig. 60, or of some other kind. R, L, C, and E are given and we want to find the
current I(t) A (amperes) in the circuit.
92
CHAP. 2
Second-Order Linear ODEs
Fig. 60.
RLC-circuit
An ODE for the cunent I(t) in the RLC-circuit in Fig. 60 is obtained from the following
law (which is the analog of Newton's second law, as we shall see later).
Kirchhoff's Voltage Law (KVL).7 The l'oltage (the electromotive force) impressed all
a closed loop is equal to the Slllll of the voltage drops across the other elements of tlze
loop.
In Fig. 60 the circuit is a closed loop. and the impressed voltage E(t) equals the sum
of the voltage drops across the three elements R, L, C of the loop.
Voltage Drops. Experiments show that a current 1 flowing through a resistor. inductor
or capacitor causes a voltage drop (voltage difference, measured in volts) at the two ends;
these drops are
(Ohm's law) Voltage drop for a resistor of resistance R ohms (D),
RI
dI
LJ' = L - Voltage drop for an inductor of inductance L henrys (H),
dt
Q
Voltage drop for a capacitor of capacitance C farads (F).
C
Here Q coulombs is the charge on the capacitor, related to the current by
I(t)
=
dQ
dt '
equivalently,
Q(t)
=
fI(t) dr.
This is summarized in Fig. 61.
According to KVL we thus have in Fig. 60 for an RLC-circuit with electromotive force
E(t) = Eo sin wt (Eo constant) as a model the "integro-differential equation"
0')
I
LJ, + RI + C
f
I lit = E(t) = Eo sin wt.
7GUSTAV ROBERT KIRCHHOFF (1824--1887). German physicist. Later we shall also need Kirchholrs
current law (KCL):
At allY poillf of a circlIit, the Slim of the illf/owillg currents is equal to the slim of the outflowil1g currents.
The units of measurement of electrical quantities are named after ANDRE MARIE AMPERE (\ 775-1836),
French physicist. CHARLES AUGUSTIN DE COULOMB (1736-1806), French physicist and engineer,
MICHAEL FARADAY (1791-1867), Engli~h physicist, JOSEPH HENRY (I 797-1 878}, American physicist.
GEORG SIMON OHM (1789-1854), Gemmn physicist, and ALESSANDRO VOLTA (1745-1827), Italian
physicist.
SEC. 2.9
Modeling: Electric Circuits
93
-
Notation
Symbol
Ohm's resistor
---ANVV-
R
Ohm's resistance
ohmsC!l)
RI
Inductor
...ro0OO"L
L
Inductance
henrys (H)
Capacitor
---11-
C
Capacitance
farads CF)
L dl
dt
Q/C
Fig. 61.
Unit
Voltage Drop
Name
Elements in an RLC-circuit
To get rid of the integral, we differentiate (1') with respect to t, obtaining
(1)
LI"
+ RI' + ~ I =
C
E' (t) = Eow cos wt.
This shows that the current in an RLC-circuit is obtained as the solution of this
nonhomogeneous second-order ODE (1) with constant coefficients.
From (l '), using I = Q', hence l' = Q", we also have directly
(1")
LQ"
+ RQ , + -I
C
Q = Eo sin wt.
But in most practical problems the current I(t) is more important than the charge Q(t),
and for this reason we shall concentrate on (1) rather than on (1").
Solving the ODE (1) for the Current.
Discussion of Solution
A general solution of (l) is the sum I = Ih + Ip, where Ih is a general solution of the
homogeneous ODE corresponding to (1) and lp is a particular solution of (1). We first
determine Ip by the method of undetermined coefficients. proceeding as in the previous
section. We substitute
(2)
+
Ip
= a cos
I~
= w( -a sin wt + b cos wt)
wt
b sin wt
I; = w 2 ( -a cos wt - b sin wt)
into (I). Then we collect the cosine terms and equate them to Eow cos wt on the right.
and we equate the sine terms to zero because there is no sine term on the right.
Lw2( -a)
+ Rwb + alC =
Eow
(Cosine tenus)
LW2(-b)
+ Rw(-a) +
=0
(Sine terms).
blC
To solve this system for a and b, we first introduce a combination of Land C, called the
reactance
(3)
s=
wL -
1
wC
94
CHAP. 2
Second-Order Linear ODEs
Dividing the previous two equations by w, ordering them, and substituting S gives
-Sa
+ Rb =
Eo
= O.
-Ra - Sb
We now eliminate b by multiplying the first equation by S and the second by R, and
adding. Then we eliminate a by mUltiplying the first equation by R and the second by
-So and adding. This gives
In any practical case the resistance R is different from zero. so that we can solve for a
and b,
(4)
Equation (2) with coefficients a and b given by (4) is the desired pm1icular solution Ip of
the nonhomogeneous ODE (I) governing the current I in an RLC-circuit with sinusoidal
electromotive force.
Using (4). we can write Ip in terms of "physically visible" quantities. namely. amplitude
10 and phase lag () of the cun'ent behind the electromotive force, that is.
(5)
where [see (4) in App. A3.1]
tan () =
a
S
b
R
V
The quantity R2 + 52 is called the impedance. Our formula shows that the impedance
equals the ratio Eollo. This is somewhat analogous to Ell = R (Ohm's law).
A general solution of the homogeneous equation con'esponding to (I) is
where Al and A2 are the roots of the characteristic equation
A2
R
+-
L
A+
-
We can write these roots in the form Al = -a
a=
1
LC
= O.
+ {3 and A2 = -a -
{3, where
R
2L '
Now in an actual circuit, R is never zero (hence R > 0). From this it follows that Ih
approaches zero, theoretically as t ~ x, but practically after a relatively short time. (This
is as for the motion in the previous section.) Hence the transient current 1= Ih + Ip tends
SEC. 2.9
95
Modeling: Electric Circuits
to the steady-state current [p, and after some time the output will practically be a harmonic
oscillation, which is given by (5) and whose frequency is that of the input (of the
electromotive force).
E X AMP L E 1
RLC-Circuit
Find the cunent l(f) in an RLC-circuit with R = II n (ohms), L = 0.1 H (henry), C = 1O-2 F (farad), which
is connected to a source of voltage E(f) = 100 sin 400f (hence 63~ HL = 63~ cycles/sec, because
400 = 63~' 21T). A,sume that cunent and charge are zero when f = O.
Solution.
Step I. General solution of the homogeneous ODE. Substituting R, L, C, and the derivative E' (f)
into (I), we obtain
+
0.11"
Ill'
+
JOOI = 100' 400 cos 400f.
Hence the homogeneous ODE is 0.11" + III' + 1001
0.1,1.2
The roob are ,1.1 =
+
11,1.
=
+
O. Its characteristic equation is
100 = O.
·10 and ,1.2 = -100. The corresponding general solution of the homogeneous ODE is
Step 2. Particular solution Ip of (1). We calculate the reactance S = 40 - 114 = 39.75 and the steady-state
current
11'(1) = a
cos 400f
~
/J sin 4001
with coefficients obtained from (4)
-100· 39.75
]]2 + 39.752 = -2.3368,
[[=
b=
II
100·11
2
2 = 0.6467.
+ 39.75
Hence in our present case, a general solution of the nonhomogeneous ODE (1) is
+
1(1) = C1e-lOt
(6)
C2e-lOOt -
2.3368 cos 400f
+ 0.6467 sin 4(Xlf.
Step 3. Particular solution satisfying the initial conditions. How to use Q(O) = O? We finally detennine
and C2 from the initial conditions 1(0) = 0 and Q(O) = O. From the first condition and (6) we have
(7)
1(0) =
("1
Furthermore, using (1 ') with
obtain
+
L1'(0)
Differentiating (6) and setting
l' (0)
= -
IOc1 - 100c2
/(t) =
+ R'O +
f = 0,
I
C'O
c1
=
hence
0,
1'(0) =
(1'»,
we
o.
we thus obtain
+ 0 + 0.6467' 4(Xl
The solution of this and (7) is
hence
c2 - 2.3368 = 0,
= 0 and noting that the integral equals QU) (see the formula before
f
C1
=
0,
hence
- Hlc 1 = HXJ(2.3368 -
C1) -
258.68.
= -0.2776, C2 = 2.6144. Hence the answer is
-0.:!.776e-
lOt
+ 2.61..J4e- lOOt
-
2.3368 cos 400f
+ 0.6467 sin 400f.
Figure 62 on p. 96 shows I(t) as well as 11'(1), which practically coincide, except for a very short time near
f = 0 because the exponential terms go to zero very rapidly. Thus after a velY short time the current will
practically execute harmonic oscillations of the input frequency 63~ Hz = 63~ cycles/sec. Its maximum amplitude
and phase lag can be ,een from (5), which here takes the form
11'(1) =
2.4246 sin (400f - 1.3(08).
•
96
CHAP. 2
Second-Order Linear ODEs
y
3
,
A
.
,,
,, ,
, ,
,,, ,,,
,,
\
\
\
\
2
,
o
"
-1
-2
-3
Fig. 62.
Transient and steady-state currents in Example 1
Analogy of Electrical and Mechanical Quantities
Entirely different physical or other systems may have the same mathematical model.
For instance, we have seen this from the various applications of the ODE y' = Icy in
Chap. I. Another impressive demonstration of this unifyi1lg power of mathematics is
given by the ODE (I) for an electric RLC-circuit and the ODE (2) in the last section for
a mass-spring system. Both equations
L/"
I
+ RI' + C
I = EOW cos wt
and
my"
+ cy' +
ky
= Fo cos wt
are of the same form. Table 2.2 shows the analogy between the various quantities involved.
The inductance L corresponds to the mass 111 and, indeed, an inductor opposes a change
in current, having an "inertia effect" similar to that of a mass. The resistance R corresponds
to the damping constant c, and a resistor causes loss of energy, just as a damping dashpot
does. And so on.
This analogy is strictly quantitative in the sense that to a given mechanical system we
can construct an electric circuit whose current will give the exact values of the displacement
in the mechanical system when suitable scale factors are introduced.
The practical impOltallce of this analogy is almost obvious. The analogy may be used
for constructing an "'electrical moder· of a given mechanical model, resulting in substantial
savings of time and money because electric circuits are easy to assemble, and electric
quantities can be measured much more quickly and accurately than mechanical ones.
Table 2.2
Analogy of Electrical and Mechanical Quantities
Electncal System
Inductance L
Resistance R
Reciprocal lIC of capacitance
Derivative Eow cos wt of }
electromotive force
Current I(t)
Mechanical System
Mass 111
Damping constant c
Spring modulus k
Driving force Fo cos wt
Displacement yet)
SEC. 2.9
97
Modeling: Electric Circuits
-.
1. (RL-circuit) Model the RL-circuit in Fig. 63. Find the
general solution when R. L. E are any constants. Graph
or sketch solutions when L = 0.1 H. R = 5 D.
E = 12V.
Current l(t)
c
2. (RL-circuit) Solve Pmb. 1 when E = Eo sin wt and R,
L, Eo, ware arbitrary. Sketch a typical solution.
3. (RC-circuit) Model the RC-circuit in Fig. 66. Find the
current due to a constant E.
4. (RC-circuit) Find the current in the RC-circuit in
Fig. 66 with E = Eo sin wt and arbitrary R. C. Eo> and w.
Fig. 67.
5. (LC-circuit) This is an RLC-circuit with negligibly
small R (analog of an undamped mass-spring system).
Find the current when L = 0.2 H. C = 0.05 F, and
E = sin r V, assuming zero initial current and charge.
6. (LC-circuit) Find the current when L = 0.5 H.
C = 8 . 10-4 F, E = [2 V and initial current and charge
zero.
17-91
RL-circuit
Current Btl
5
4
*
3
2t---~~~====----1 '
o
0.02
0.04
Fig. 64.
0.06
0.08
RLC-CIRCUITS (FIG. 60, P. 92)
7. (Tuning) In runing a stereo system to a radio station,
we adjust the tuning control (tum a knob) that changes
C (or perhaps L) in an RLC-circuit so that the amplitude
of the steady-state current (5) becomes maximum. For
what C will this happen?
8. (Transient current) Prove the claim in the text that if
R
0 (hence R > 0). then the tr~msient cun'ent
appmaches Ip as r -'; x.
L
Fig. 63.
Current 1 in Problem 3
0.1
Currents in Problem 1
9. (Cases of damping) What are the conditions for an
RLC-circuit to be (I) overdamped. (Il) critically
damped. (III) underdamped? What is the critical
resistance Rcrit (the analog of the critical damping
constant 2v;;;i)?
110-121
Find the steady-state current in the RLC-circuit
in Fig. 60 on p. 92 for the given data. (Show the details of
your work.)
0.5 H. C = 0.1 F. E = 100 sin 2t V
0.25 H, C = 5' 10-5 F, E = 1I0 V
12. R = 2 D, L = I H, C = 0.05 F, E = 1~7 sin 3r V
lO. R
11. R
121t
=
=
8 D, L
1 D, L
=
=
@-g; I
Fig. 65.
Typical current I = e- o.lt
in Problem 2
+
sin (t - ~7T)
Find the transient current (a general solution)
in the RLC-circuit in Fig. 60 for the given data. (Show the
details of your work.)
13. R = 6 D, L = 0.2 H. C = 0.025 F. E = 110 sin lOr V
R
14. R = 0.2 D, L = 0.1 H, C = 2 F. E = 754 sin 0.51 V
15. R = 1110 D, L = 112 H. C = 100113 F,
E = e- 4t (1.932 cos ~r + 0.246 sin ~r) V
I~~
c
Fig. 66.
RC-circuit
Solve the initial l'alue problem for the
RLC-circuit in Fig. 60 with the given data. assuming zero
initial curren! and charge. Graph or sketch the solution.
(Sho\\ the details of your worL)
98
CHAP. 2
16. R =
17. R =
E =
18. R =
E =
Second-Order Linear ODEs
4 fl, L = 0.1 H, C = 0.025 F, E = 10 sin lOt V
(b) The complex impedance Z is defined by
6 fl, L = 1 H. C = 0.04 F.
600 (cos t + 4 sin t) V
Z = R
3.6 n. L = 0.2 H, C = 0.0625 F,
164 cos lOt V
+ is
= R
+ ;(WL -
~c).
Show that K obtained in (a) can be written as
19. WRITING PROJECT. Analogy of RLe-Circuits and
Damped Mass-Spring Systems. (a) Write an essay of
2-3 pages based on Table 2.2. Describe the analogy in
more detail and indicate its practical significance.
(b) What RLC-circuit with L = I H is the analog of
the mass-spring system with mass 5 kg, damping
constant 10 kg/sec, spring constant 60 kglsec2 , and
driving force 220 cos lOt?
(c) Illustrate the analogy with another example of your
own choice.
20. TEAM PROJECT. Complex Method for Particular
Solutions. (a) Find a particular solution of the complex
ODE
K=
Eo
iZ
Note that the real part of Z is R. the imaginary part is
the reactance S, and the absolute value is the impedance
Izi
=
V R2 + S2 as defined before. See Fig. 68.
(c) Find the steady-state solutLm of the ODE
[" + 2/' + 31 = 20 cos t, first by the real method and
then by the complex method, and compare. (Show the
details of your work.)
(d) Apply the complex method to an RLC-circuit of
your choice.
(8)
by substituting Ip = Ke'wt (K unknown) and its
derivatives into (8), and then take the real part Ip of Ip.
showing that Ip agrees with (2), (4). Hint. Use the Euler
formula e iwt = cos wt + ; sin wt [(11) in Sec. 2.2 with
wt instead of tl Note that Eow cos wt in (1) is the real
part of Eowe'wt in (8). Use ;2 = -1.
2.10
Solution
R
Fig. 68.
Real axis
Complex impedance Z
by Variation of Parameters
We continue our discussion of nonhomogeneous linear ODEs
(1)
y"
+ p(x»),' +
q(x)y
=
rex).
In Sec. 2.6 we have seen that a general solution of (1) is the sum of a general solution )'1,
of the corresponding homogeneous ODE and any particular solution yp of (1). To obtain),p
when r(x) is not too complicated, we can often use the method of 1I1ldetenllined coefficie11fs.
as we have shown in Sec. 2.7 and applied to basic engineering models in Secs. 2.8 and 2.9.
However, since this method is restricted to functions r(x) whose derivatives are of a form
similar to r(x) itself (powers. exponential functions. etc.). it is desirable to have a method valid
for more general ODEs (I)' which we shall now develop. It is called the method of variation
of parameters and is credited to Lagrange (Sec. 2.1). Here p, q. r in (1) may be v31iable
(given functions of x). but we assume that they are continuous on some open interval I.
Lagrange's method gives a particular solution Yp of (1) on I in the form
(2)
SEC. 2.10
99
Solution by Variation of Parameters
where
)'1>
.\'2 form a basis of solutions of the corresponding homogeneous ODE
.\''' + p(x)y' + q(x)y
(3)
0
=
on I. and W is the Wronskian of YI • .\'2.
(4)
(see Sec. 2.6).
CAUTION! The solution formula (2) is obtained under the assumption that the ODE
is written in standard form. with y" as the flfst term as shown in 0). If it starts with f(x)y".
divide first by f(x).
The integration in (2) may often cause difficulties, and so may the determination of YI .
.\'2 if (I) has variable coefficients. If you have a choice. use the previous method. It is
simpler. Before deriving (2) let us work an example for which you do need the new
method. (Try otherwise.)
E X AMP L E 1
Method of Variation of Parameters
Solve the nonhomogeneous ODE
V"
+v
=
sec x =
cos x
Solulion.
A basis of solutions of the homogeneous ODE on any interval is .1'1
gives the Wronskian
lIl(YI' .1'2) = cos x cos
= cos x.
1'2
= sin x. This
sin x (-sin x) = I.
T -
From (2). choosing zero constants of integration. we get the particular solution of the given ODE
Yp
=
-cos xfSin x sec x dx + sin xfcos x sec x dr
(Fig. 69).
= cos x In Icos xl
+ x sin x
Figure 69 shows Yp and its first term, which is small, so that x sin x essentially determines the shape of the curve
of )'p. (Recall from Sec. 2.8 that we have seen x sin x in connection with resonance. except for notation.) From
yp and the general solution Yh = eIYI + C2.1'2 of the homogeneous ODE we obtain the answer
.I'
= Yh + Yp =
(el
+ In Icosxl) cosx + (e2 + x) sinx.
Had we included integration constants -Cl' c2 in (2), then (2) would have given the additional
+ e2 sin x = eLYI + e2Y2, that is, a general solution of the given ODE directly from (2). This will
always be the case.
•
ci cos x
y
10
5
0
-5
-10
Fig. 69.
D
V
I
2
8
10
12
\J
x
Particular solution yp and its first term in Example 1
100
CHAP. 2
Second-Order Linear ODEs
Idea of the Method.
Derivation of (2)
What idea did Lagrange have? What gave the method the name? Where do we use the
continuity assumptions?
The idea is to start from a general solution
of the homogeneous ODE (3) on an open interval I and to replace the constants ("the
parameters") Cl and C2 by functions u(x) and vex): this suggests the name of the method.
We shall determine u and v so that the resulting function
(5)
is a particular solution of the nonhomogeneous ODE (1). Note that Yh exists by Theorem
3 in Sec. 2.6 because of the continuity of p and q on l. (The continuity of T will be used
later.)
We determine u and v by substituting (5) and its derivatives into (1). Differentiating
(5). we obtain
v' = u'v.J + uv'
.1 + v\
.2 + vv'.
.2
_p
Now yp must satisfy (I). This is one condition for lYvo functions u and v. It seems plausible
that we may impose a second condition. Indeed, our calculation will show that we can
detelmine u and v such that Yp satisfies (1) and u and v satisfy as a second condition the
equation
(6)
This reduces the first derivative
Y; to the simpler form
,
(7)
Yp =
,
UYI
,
+ VY2'
Differentiating (7), we obtain
"
(8)
Yp
= u "YJ
+ UYII, + V "Y2 + VY2'"
We now substitute yp and its derivatives according to (5), (7), (8) into (1). Collecting
terms in u and terms in v, we obtain
Since Yl and
)'2
(9a)
Equation (6) is
(9b)
are solutions of the homogeneous ODE (3), thIs reduces to
u'y{
+ v'y~
= T.
SEC. 2.10
Solution by Variation of Parameters
101
This is a linear system of two algebraic equations for the unknown functions u' and v'
We can solve it by elimination as follows (or by Cramer's rule in Sec. 7.6). To eliminate
v', we multiply (9a) by -Y2 and (9b) by y~ and add, obtaining
thus
Here, W is the Wronskian (4) of Yl' Y2' To eliminate u' we multiply (9a) by Yl' and (9b)
by -)'~ and add. obtaining
thus
Since h, )'2 form a basis, we have W,* 0 (by Theorem 2 in Sec. 2.6) and can divide by W,
u
(10)
,
v'
By integration,
u= -
I
yr
.~ dx,
v =
I Wdx.
"Ir
These integrals exist because rex) is continuous. Inserting them into (5) gives (2) and
completes the derivation.
•
11-171
-
..
--
lA _ _
.
••
....
GENERAL SOLUTION
Solve the given nonhomogeneous ODE by variation of
parameters or undetennined coefficients. Give a geneml
solution. (Show the details of your work.)
1. Y" + Y = csc x
2. y" - 4y' + 4)'
3. x
2
)''' -
2x),' + 2y = x 3 cos x
4. ,," - 2y' + Y =
5. y" + Y = tan x
6. x 2)''' - xy'
7. yo"
+Y
x 2ex
=
=
+Y
cos x
eX
sin x
In
sec x
= x
+
Ixl
8. y" - 4/ + 4-," = 12e 2xlx 4
9. (D2 - 2D + l)y = x 2 + x- 2e x
10. (D 2
-
l)y = I1cosh x
11. (D 2 + 4l)y = cosh 2x
12. (x 2D2 + xD - al)y = 3x- 1 + 3x
13. (x 2D2 - 2xD + 2l)y = x 3 sin x
14. (x 2D2 + xD - 4l)y = 11.>:2
15. (D 2 + l)y = sec x - 10 sin 5x
16. (x 2D2 + xD + (x 2 - a)l)y = X 3/2 cos x.
Hint. To find)'1> )'2 set Y = UX- 1I2 .
17. (x 2D2 + xD + (x 2 - !)l)y = x 3/2 sin x.
Him: As in Prob. 16.
18. TEAM PROJECT. Comparison of Methods. The
undetermined-coefficient method should be used
whenever possible because it is simpler. Compare it
with the present method as follows.
(a) Solve y" + 2/ - 15)' = 17 sin 5x by both
methods, showing all details, and compare.
(h) Solve y" + 9)" = r 1 + /2, rl = sec 3x,
r2 = sin 3x by applying each method to a suitable
function on the right.
(e) Invent an undetermined-coefficient method for
by
nonhomogeneous
Euler-Cauchy
equations
experimenting.
102
CHAP. 2
Second-Order Linear ODEs
:a.'
•
II-$.
1. What general properties make linear ODEs particularly
attractive?
2. What is a general solution of a linear ODE? A basis of
solutions?
3. How would you obtain a general solution of a
nonhomogeneous linear ODE if you knew a general
solution of the corresponding homogeneous ODE?
4. What does an initial value problem for a second-order
ODE look like?
5. What is a paI1icuiar solution and why is it more common
than a general solution as the answer to practical
problems?
6. Why are second-order ODEs more important in
modeling than ODEs of higher order?
7. Describe the applications of ODEs in mechanical
vibrating systems. What are the elecuical analogs of
those systems?
8. If a construction, such as a bridge, shows undesirable
resonance. what could you do?
19-181
GENERAL SOLUTION
Find a general solution. Indicate the method you are using
and sho" the details of your calculation.
9. y"
2/
Sy = 52 cos 6x
10. y" + 6/ + 9y = e- 3x - 27x 2
11. y" + S/ + 25y = 26 sin 3x
12. yy" = 2/ 2
13. (x 2D2 + 2xD - 121)y = Jfx 3
14. (x 2D2 + 61:D + 6/).,· = x 2
15. (D 2 - 2D + I)y = x- 3 e x
16. (D 2 - 4D + 5l)y = e 2x csc x
17. (D 2 - 2D + 21)y = e 7 esc x
18. (4x 2D2 - 24xD + 49/)y = 36x 5
119-251
TIONS AND PROBLEMS
126--341
APPLICATIONS
26. Find the steady-state, solution of the system in Fig. 70
when 111 = 4. c = 4. k = 17 and the driving force is
102 cos 3f.
27. Find the motion of the system in Fig. 70 with mass
0.25 kg. no damping. spring constant I kg/sec2 , and
driving force 15 cos O.5f - 7 sin l.5f nt. assuming zero
initial displacement and velocity. For what frequency
of the driving force would you get resonance?
28. In Prob. 26 find the solution corresponding to initial
displacement 10 and initial velocity O.
29. Show that the system in Fig. 70 with
111 = 4, c = O.
k = 36, and driving force 61 cos 3.1 T exhibits beats.
Him: Choose zero initial conditions.
30. In Fig. 70 let 111 = 2. c = 6, k = 27, and
r(t) = 10 cos wf. For what w will you obtain the steadystate vibration of maximum possible amplitude?
Determine this amplitude. Then use this wand the
undetelmined-coefficient method to see whether you
obtain the same amplitude.
31. Find an electrical analog of the mass-spring system in
Fig. 70 with mass 0.5 kg, spling constant 40 kg/sec2 ,
damping constant 9 kg/sec. and driving force
102 cos 6f nL Solve the analog. assuming Lew initial
current and charge.
32. Find the current in the RLC-circuit in Fig. 71
when L = 0.1 H. R = 20 n, C = 2· 10-4 F. and
E(t) = 11 0 sin 415t V ( 66 cycles/sec).
33. Find the current in the RLC-circuit when L = 0.4 H,
R = 40 n. C = 10- 4 F, and E(t) = 220 sin 314t V
(50 cycles/sec).
34. Find a pat1icular solution in Prob. 33 by the complex
method. (See Team Project 20 in Sec. 2.9.)
INITIAL VALUE PROBLEMS
Solve the following initial value problems. Sketch or graph
the solution. (Show the details of your wOIk)
19. y" + 5y'
14y = 0, yeO) = 6, y' (0) = -·6
20. y" + 6y' + ISy = 0, yeO) = 5. /(0) = -21
21. x 2 y" - xy' - 24y = 0, y(l) = 15, y'(1) = 0
22. x 2 y" + 15x/ + .J.9\" = o. yO) = 2. y'(1) = -II
23. y" + 5/ + 6y = IOSx 2 , yeO) = IS, y' (0) = -26
24. -,," + y' + 2.5y = 13 cos x, yeO) = S.O,
y' (0) = 4.5
25. (x 2D2 + xD - 4l)y = x 3 , yO) = -4/5,
y' (I) = 93/5
E(t)
Fig. 70.
Mass-spring
system
Fig. 71.
RLC-circuit
103
Summary of Chapter 2
Second-Order Linear ODEs
Second-order linear ODEs are particularly important in applications. for instance.
in mechanics (Sees. 2A. 2.8) and electrical engineering (Sec. 2.9). A second-order
ODE is called linear if it can be written
(1)
+
y"
p(x)y'
+
=
q(x)y
rex)
(Sec. 2.1).
(If the first term is, say. f(x)y", divide by f(x) to get the "standard form" (1) with
-,," as the first term.) Equation (1) is called homogeneous if r(x) is zero for all x
considered, usually in some open interval; this is written rex) "" O. Then
(2)
.v"
+ p(x»)" +
q(x)y
= O.
Equation (I) is called nonhomogeneous if rex) =1= 0 (meaning rex) is not zero for
some x considered).
For the homogeneous ODE (2) we have the imp0l1ant superposition principle
(Sec. 2.1) that a linear combination y = kY1 + 1)'2 of two solutions .'"1, Y2 is again
a solution.
Two linearly independent solutions ."1,)'2 of (2) on an open interval I form a basis
(or fundamental system) of solutions on I. and y = C1Y1 + C2Y2 with arbitrary
constants (\, C2 is a general solution of (2) on I. From it we obtain a particular
solution if we specify numeric values (numbers) for C1 and C2. usually by prescribing
two initial conditions
(xo. Ko. K1 given numbers; Sec. 2.1).
(3)
(2) and (3) together fonn an initial value problem. Similarly for (I) and (3).
For a nonhomogeneous ODE (1) a general solution is of the form
(4)
.v
= •vh
(Sec. 2.7).
+". p
Here Yh is a general ~olution of (2) and Yp is a particular solution of (I). Such a yp
can be determined by a general method (variation of parameters, Sec. 2.10) or in
many practical cases by the method of undetermined coefficients. The latter applies
when (I) has constant coefficients p and q, and r{x) is a power of x. sine, cosine,
etc. (Sec. 2.7). Then we write (1) as
(5)
y"
+
a/
+
The corresponding homogeneous ODE y'
where ,\. is a root of
(6)
,\.2
by
=
rex)
+ ay' +
+ a'\' + b = O.
by
(Sec. 2.7).
= 0 has solutions y = eAX•
104
CHAP. 2
Second-Order Linear ODEs
Hence there are three cases (Sec. 2.2):
Case
Type ot Roots
-
I
11
III
General Solution
-Y = cleA1X + C2 eA2X
Y = (Cl + c 2 x)e- ax/ 2
y = e- ux/ 2 (A eos w*x + B sin w*x)
--
-
Distinct real Ab A2
1
Double -"2a
Complex -~(/ ± iw*
Important applications of (5) in mechanical and electIical engineering in connection
with vibrations and resonance are discussed in Secs. 2.4. 2.7. and 2.8.
Another large class of ODEs solvable "algebraically" consists of the
Euler-Cauchy equations
(Sec. 2.5).
(7)
These have solutions of the form y
equation
(~)
1112
+
=
(a -
x1n, where
l)m
111
+b=
is a solution of the auxiliary
O.
Existence and uniqueness of solutions of (1) and (2) is discussed in Sees. 2.6
and 2.7, and reduction of order in Sec. 2.1.
••••
CHAPTER
3
Higher Order Linear ODEs
In this chapter we extend the concepts and methods of Chap. 2 for linear ODEs from order
n = 2 to arbitrary order n. This will be straightforward and needs no new ideas. However,
the formulas become more involved, the variety of roots of the characteristic equation (in
Sec. 3.2) becomes much larger with increasing n, and the Wronskian plays a more
prominent role.
Prerequisite: Secs. 2.1, 2.2. 2.6. 2.7, 2.10.
References Gnd Answers to Problems: App. I Pm1 A. and App. 2.
3.1
Homogeneous Linear ODEs
Recall from Sec. 1.1 that an ODE is of Ilth order if the nth derivative yen> = dnyldx rt of
the unknown function y(x) is the highest occurring delivative. Thus the ODE is of the form
F(x, y,
y', ... . /n» =
0
where lower order derivatives and y itself mayor may not occur. Such an ODE is called
linear if it can be written
(1)
In>
+ Pn_l(X)y<n-D + ... + Pl(X)y' + Po(x)y =
rex).
(For n = 2 this is 0) in Sec. 2.1 with 17] = P and Po = q). The coefficients Po, ... , Pn-l
and the function r on the right are any given functions of x, and y is unknown./n > has
coefficient I. This is practical. We call this the standard form. (If you have Pn(x)/n>,
divide by Pn(x) to get this form.) An nth-order ODE that cannot be written in the form
(1) is called nonlinear.
If r(x) is identically zero, r(x) == 0 (zero for all x considered. usually in some open
interval I). then (1) becomes
(2)
In)
+ Pn_l(X)y<n-D + ...
+ Pl(X)y'
+ Po(x)y = 0
and is called homogeneous. If rex) is not identically zero. then the ODE is called
nonhomogeneous. This is as in Sec. 2.1.
A solution of an nth-order (linear or nonlinear) ODE on some open interval/is a
function), = hex) that is defined and 11 times differentiable on I and is such that the ODE
becomes an identity if we replace the unknown function y and its derivatives by h and its
corresponding derivatives.
105
106
CHAP_ 3
Higher Order Linear ODEs
Homogeneous Linear ODE: Superposition Principle,
General Solution
Sections 3_1-3_2 will be devoted to homogeneous linear ODEs and Sec. 3.3 to
nonhomogeneous linear ODEs_ The basic superposition or linearity principle in Sec_ 2_1
extends to nth order homogeneous linear ODEs as follows_
THEOREM 1
Fundamental Theorem for the Homogeneous Linear ODE (2)
For a homogeneous linear ODE (2), sums and constant multiples of solutions on
some open i1lferval 1 are again solutions 011 l. (This does not hold for a
nonhomogeneous or nonlinear ODE!)
The proof is a simple generalization of that in Sec. 2_1 and we leave it to the student_
Our further discussion parallels and extends that for second-order ODEs in Sec. 2_1_
So we define next a general solution of (2), which will require an extension of linear
independence from 2 to n functions_
DEFINITION
General Solution, Basis, Particular Solution
A general solution of (2) on an open intervall is a solution of (2) on 1 of the form
(3)
(CI, - - - , C n
arbitrary)
where Y1- - - - , Yn is a basis (or fundamental system) of solutions of (2) on l: that
is, these solutions are linearly independent on l, as defined below_
A particular solution of (2) on 1 is obtained if we assign specific values to the
n constants CI' - - - , cn in (3)_
DEFINITION
Linear Independence and Dependence
functions YI(X), - - - • Yn(x) are called linearly independent on some interval 1
where they are defined if the equation
11
(4)
on 1
implies that all kI , - - - , kn are zero_ These functions are called linearly dependent
on I if this equation also holds on I for some kb - - - , k n not all zero_
(As in Secs_ 1.1 and 2_1, the arbitrary constants
CI, ••• , Cn
must sometimes be restricted
to some interval.)
If and only if .\'1, - - - , Yn are linearly dependent on I, we can express (at least) one of
these functions on I as a "linear combination" of the other n - I functions. that is, as
a sum of those functions. each multiplied by a constant (zero or not)_ This motivates the
term "linearly dependent." For instance, if (4) holds with k1 -=1= 0, we can divide by ki and
express YI as the linear combination
SEC. 3.1
107
Homogeneous Linear ODEs
Note that when n = 2, these concepts reduce to those defined in Sec. 2.1.
E X AMP L E 1
Linear Dependence
Show that the functions .1'1
Solution.
E X AMP L E 2
•
Y2 = 0.1'1 + 2.5)'3· This proves linear dependence on any interval.
Linear Independence
= ..2'.\"3 = x 3 are linearly independent on any interval. for instance, on -I ::'" x ::'" 2.
Show that.\"1
=
Solution.
Equation (4) is
k2
E X AMP L E 3
= x 2 , Y2 = 5x, .1'3 = 2x are linearly dependent on any interval.
X, .1'2
k 1x
+
k2X2
+
k3X3
= O. Taking (aL~ = -1. (b) x = I. (c) x = 2. we get
= 0 from (a) + (b). Then k3 = 0 from (c) -2(b). Then kl = 0 from (b). This proves linear independence.
•
A better method for testing linear independence of solutions of ODEs will soon be explained.
General Solution. Basis
Solve the fourth-order ODE
/v Solution.
5y" + 4.'"
= 0
As in Sec. 2.2 we try and substitute .I' = e Ax . Omitting the common factor eA.'", we obtain the
characteristic equation
This is a quadratic equation in J.L ~ A2, namely,
The roots are J.L
interval is
=
1 and 4. Hence A
= -2. -I. I. 2. This give, four ,olutions. A general solution on any
provided those four solutions are linearly independent. This is true but will be shown later.
•
Initial Value Problem. Existence and Uniqueness
An initial value problem for the ODE (2) consists of (2) and
11
initial conditions
(5)
with given Xo in the open interval I considered. and given Ko, ... , KIl - 1.
In extension ofthe existence and uniqueness theorem in Sec. 2.6 we now have the following.
THEOREM 2
Existence and Uniqueness Theorem for Initial Value Problems
If the coefficients Po(x), ... , Pn-l(x) of (2) are continuous on some open interred I
and Xo is in I. then the initial value problem (2), (5) has a unique solution y(x) on 1.
Existence is proved in Ref. [All] in App. l. Uniqueness can be proved by a slight
generalization of the uniqueness proof at the beginning of App. 4.
108
E X AMP L E 4
CHAP. 3
Higher Order Linear ODEs
Initial Value Problem for a Third-Order Euler-Cauchy Equation
Solve the following initial value problem on any open intervall on the positive x-axis containing .T = 1.
Solution.
Step 1. General solution. As in Sec. 2.5 we try
/11(111 -
1)(111 -
2)~m
y = x"'.
- 3111(111 - Ih m
/'(1) = -4.
/(1)= I,
y(1) = 2,
+
611n
By differentiation and 'Llb,titution.
7n
-
6xm = O.
Dropping xm and ordering gives 111 3 - 6n? + 11m - 6 = O. If we can guess the root III = I. we can divide
by III - I and find the other roots 2 and 3, thus obtaining the solutions x. x 2 , x 3 . which are linearly independent
on 1 (see Example 2). [Tn general one shall need a root-finding method, such as Newton's (Sec. 19.2), also
available in a CAS (Computer Algebra System).] Hence a general solution is
3
valid on any interval I. even when it includes x = 0 where the coefficients of the ODE divided by x (to have
the standard form) are not continuous.
Step 2. Pwticular solution. The derivatives are.v' =
Cl
+ 2c2x + 3C3X2 and ,,"
=
2C2
+ 6C3X, From this and
y and the initial conditions we get by setting X = I
(a) yO)
= c1 +
C2
+
C3
+
2("2
+
3("3 =
2C2
+ 6c3
(b) l(1) = cl
(c) /'(1) =
=
=
2
-4.
This is solved by Cramer's rule (Sec. 7.6), or by elimination. which is simple, as follows. (b) - (a) gives
(d) C2 + 2C3 = -I. Then (e) - 2(d) gives C3 = -l. Then (e) gives C2 = I. Finally Cl = 2 from (a).
•
Answer: y = 2, + x 2 - x 3 .
Linear Independence of Solutions. Wronskian
Linear independence of solutions is crucial for obtaining general solutions. Although it
can often be seen by inspection. it would be good to have a criterion for it. Now Theorem
2 in Sec. 2.6 extends from order n = 2 to any n. This extended criterion uses the Wronskian
W of n solutions )'1, . . • , Yn defined as the nth order determinant
Note that W depends on x since )'1, . . . , )'n does. The criteIion states that these solutions
form a basis if and only if W is not zero: more precisely:
THEOREM 3
Linear Dependence and Independence of Solutions
Let the ODE (2) have continuous coefficients Po(x) . .. '. Pn-l(x) all all open
interval l. Then n solutions Yb ... , Yn of (2) on 1 are linearly depelldent on 1 if
and ollly if their Wronskian is zero for some x = Xo in T. Furtlzenl1ore, if W is zero jar
x = X o, thell W is identically zero Oil I. Hence if there is an Xl in 1 at which W is
Ilot ::,ero. thell Y1> ... , Yn are linearly indepel1dellf all I. so that they f01111 a basis
of solutions of (2) all T.
SEC 3.1
109
Homogeneous Linear ODEs
PROOF
(a) Let.\"1 ..... Yn be linearly dependent solutions of (2) on I. Then. by definition. there
are constants kI , . . . , k n not all zero, such that for all x in I.
(7)
By n - I differentiations of (7) we obtain for all x in I
=0
(8)
(7). (8) is a homogeneous linear system of algebraic equations with a nontlivial solution
kb ... , kn . Hence its coefficient determinant must be zero for every x on I, by Cramer's
theorem (Sec. 7.7). But that determinant is the Wronskian W, as we see from (6). Hence
W is zero for every x on I.
(b) Conversely, if W is zero at an Xo in I, then the system (7), (8) with x = Xo has a solution
kI *, ... , kn *, not all zero, by the same theorem. With these constants we define the
solution y* = kI*YI + .,. + kn*Yn of (2) on I. By (7), (8) this solution satisfies the
initial conditions y*(x o) = 0, ... , y*(n-l>(xo) = O. But another solution satisfying the
same conditions is y == O. Hence y* ==)' by Theorem 2, which applies since the coefficients
of (2) are continuous. Together, y* = kI*YI + ... + k n *)'n == 0 on I. This means linear
dependence of YI' .... )In on I.
(c) If W is zero at an Xo in T, we have linear dependence by (b) and then W == 0 by (a).
Hence if W is not zero at an Xl in I, the solutions YI, ... , Yn must be linearly independent
001.
•
E X AMP L E 5
Basis, Wronskian
We can now prove that in Example 3 we do have a basis. In evaluating W. pull out the exponential functions
columnwise. In the result. subtract Column I from Columns 2. 3. 4 lwithout changing Column I). Then
expand by Row 1. In the resulting third-order determinant. subtract Column I from Column 2 and expand
the result by Row 2:
e- 2 .-.::
e- x
eX
e
-2e -2x
-e -x
eX
2e 2x
-2
-2x
e- x
e·x
4e 2x
4
W=
4e
-8e -2x
-e-x
eX
2x
8e2x
3
-8
-I
4
2
-3
-3
7
9
0 = 72.
4
-I
16
•
8
A General Solution of (2) Includes All Solutions
Let us first show that general solutions always exist. Indeed. Theorem 3 in Sec. 2.6 extends
as follows.
THEOREM 4
Existence of a General Solution
If the coefficients Po(x), .. '. Pn-I(x) 0/(2) are continuous on some opell interval
I, then (2) lzas a general solution Oil I.
CHAP. 3
110
PROOF
Higher Order Linear ODEs
We choose any fixed Xo in I. By Theorem 2 the ODE (2) has n solutions )"1' •••• y",
where)J satisfies initial conditions (5) with K j - 1 = 1 and all other K"s equal to zero. Their
Wronskian at Xo equals 1. For instance, when 11 = 3, then Yl(XO) = 1, y~(xo) = 1,
y~(xo) = I, and the other initial values are zero. Thus, as claimed,
W(."ICt O)' .\'2(XO), Y3(XO»)
=
0
)'1(XO)
Y2(XO)
."3(XO)
Y~(xo)
y~(xo)
.\'~(xo)
0
y~(xo)
Y~Cto)
y~(xo)
0
0
0
1.
0
Hence for any n those solutions YI- .... Yn are linearly independent on I, by Theorem 3.
They form a basis on I. and y = CIY! + ... + CnY" is a general solution of (2) on I. •
We can now prove the basic prope11y that from a general solution of (2) every solution
of (2) can be obtained by choosing suitable values of the arbitrary constants. Hence an
nth order linear ODE has no singular solutions, that is, solutions that cannot be obtained
from a general solution.
THEOREM 5
General Solution Includes All Solutions
If the ODE (2) has continuolls coefficients Po(x). ... , P,,-1 (x) on some open interval
T, then e\'ery solution." = Y(x) of (2) 011 T is of the foml
(9)
where ."1' ... , y" is a basis of solutions of (2) on T al/d C 1, ... , C n are suitable
cOl/sta1/ts.
PROOF
Let Y be a given solution and y = ('1)'1 + ... + c"."n a general solution of (2) on I. We
choose any fixed Xo in I and show that we can find constants CI> •• '. Cn for which y and
its first 1/ - 1 derivatives agree with Y and its corresponding derivatives at xo. That is,
we should have at x = .\'0
+
=Y
,
+ cn )' n =
y'
(10)
But this is a linear system of equations in the unknowns Cl,
cn . Its coefficient
determinant is the Wronskian W of Yl, ... , Yn at xo. Since .\'1 •... , y" form a basis. they
are linearly independent, so that W is not zero by Theorem 3. Hence (10) has a unique
solution ('1 = CI> ... , Cn = Cn (by Cramer's theorem in Sec. 7.7). With these values
we obtain the particular solution
on I. Equation (10) shows that y* and its first n - I derivatives agree at xo with Yand
its corresponding derivatives. That is, y* and Y satisfy at xo the same initial conditions.
SEC. 3.2
Homogeneous Linear ODEs with Constant Coefficients
111
The uniqueness theorem (Theorem 2) now implies that y* theorem.
Y on T. This proves the
•
This completes our theory of the homogeneous linear ODE (2). Note that for
identical with that in Sec. 2.6. This had to be expected.
- .•...--.......-.
l ]~
- .. -
11. x
To get a feel for higher order ODEs. show that the given
functions are solutions and form a basis on any interval.
Use Wronskians. (In Prob. 2. x> 0.)
.1'iv = 0
l. I, x, x 2 , x 3 ,
2
2. 1, x , X4,
x2ylll - 3x.1''' + 3.1" = 0
3. eX, xe x . x 2e x .
ylll - 3.1''' + 3y' - .1' = 0
4. e2x cos x. e2x sin x, e- 2x cos x, e- 2x sin x,
yiv -
6y" + 25y = O.
S111
3x,
yiv
+ 9y"
=
0
6. TEAM PROJECT. General Properties of Solutions
of Linear ODEs. These properties are important in
obtaining new solutions ii'om given ones. Therefore
extend Team Project 34 in Sec. 2.2 to 11th-order ODEs.
Explore statements on sums and multiples of solutions
of (1) and (2) systematically and with proofs.
Recognize clearly that no new ideas are needed in this
extension from 11 = 2 to general 11.
7-191
LINEAR INDEPENDENCE
AND DEPENDENCE
Are the given functions linearly independent or dependent
on the positive x-axis? (Give a reason.)
7. I, eX, e- x
8. x + I. x + 2. x
9. In x, In x 2 • (In X)2
10. e", e- x , sinh 2x
3.2
=
2 it is
--
TYPICAL EXAMPLES OF BASES
5. I, x, cos 3x,
11
2
•
xlxl. x
12. x. I/x. 0
13. sin 2x. sin x. cos x
14. cos 2 x, sin 2 x, cos 2x
15. tan x. cot x. I
16. (x -
17. sin x, sin ~x
18. cosh x, sinh x, cosh 2 x
2
1)2. (x
+
1)2. x
2
19. cos x, sin x. 27T
20. TEAM PROJECT. Linear Independence and
Dependence. (a) Investigate the given question about
a set 5 of functions on an intervall. Give an example.
Prove your answer.
(I) If 5 contains the zero function, can 5 be linearly
independent?
(2) If 5 is linearly independent on a subinterval J of I.
is it linearly independent on l?
(3) If 5 is linearly dependent on a subinterval J of I.
is it linearly dependent on n
(4) If 5 is linearly independent on I, is it linearly
independent on a subinterval J?
(5) If 5 is linearly dependent on 1. is it linearly
independent on a subinterval J?
(6) If 5 is linearly dependent on I, and if T contains 5,
is T linearly dependent on l?
(b) In what cases can you use the Wronskian for
testing linear independence? By what other means can
you perform such a tcst'?
Homogeneous Linear ODEs with Constant
Coefficients
In this section we consider nth-order homogeneous linear ODEs with constant coefficients,
which we write in the form
(1)
where lnJ
d'\/dr: n • etc. We shall see that this extends the case 11 = 2 discussed in
Sec. 2.2. Substituting y = e AX (as in Sec. 2.2), we obtain the characteIistic equation
(2)
CHAP. 3
112
Higher Order Linear ODEs
of (1). If A is a root of (2), then y = e AX is a solution of (1). To find these roots, you may
need a numeric method, such as Newton's in Sec. 19.2, also available on the usual CASso
For general 11 there are more cases than for 11 = 2. We shall discuss all of them and
illustrate them with typical examples.
Distinct Real Roots
If all the 11 roots AI' .... An of (2) are real and different, then the 11 solutions
(3)
Yn
=e
An X
constitute a basis for all x. The corresponding general sulution of (1) is
(4)
Indeed, the solutions in (3) are linearly independent, as we shall see after the example.
E X AMP L E 1
Distinct Real Roots
Solve the ODE y'" - 2y" - y' + 2y = O.
2A2 - A + 2 = O. II has the roots - I, I, 2; if you find one
of them by inspection. you can obtain the other two roots by solving a quadratic equation (explain!). The
corresponding general solution (4) is y = Cle -.< + C2ex + C3e2x.
•
Soluti01l. The characteristic equation is A3
-
Linear Independence of (3). Students familiar with 11th-order determinants may verify
that by pulling out all exponential functions from the columns and denoting their product
by E, thus E = exp [(AI + ... + An)x], the Wronskian of the solutions in (3) becomes
eA1X
AleA1X
e AnX
e A2X
A2 e
A2X
An
eAnX
AI2eAIX
2 A2X
A2 e
An2eAnx
AI'-leA1X
A2:-leA2X
A;:-le AnX
W=
(5)
1
=E
Al
A2
An
AI2
A22
A,,2
A;"-1
A2:- 1
Ann - I
The exponential function E is never zero. Hence W = 0 if and only if the determinant on
the right is zero. This is a so-called Vandermonde or Cauchy determinantl It can be
shown that it equals
'ALEXANDRE THEOPHILE VANDERMONDE (1735-1796). French mathematician, who worked on
solution of equations by determinants. For CAUCHY see footnote 4, in Sec. 2.5.
SEC. 3.2
113
Homogeneous Linear ODEs with Constant Coefficients
(6)
where V is the product of all factors Aj - Ak withj < k (~ n); for instance, when II = 3
we get - V = -(AI - A2)(A 1 - A3)(A2 - A3). This shows that the Wronskian is not zero
if and only if all the n roots of (2) are different and thus gives the following.
THEOREM 1
Basis
x
Solutiolls YI = e AIX, • • • , )"n = /n of (1) ("with allY real or complex A/ s) f0l711 a
basis of solutiollS of (I) on any opell interml if and only if all /l roots of (2) are
different.
Actually, Theorem I is an important special case of our more general result obtained
from (5) and (6):
THEOREM 2
Linear Independence
Ally !lumber of solutions of (1) of the f0I711 e AX are linearly independent on an open
interl'ill I
if and only if the
correspondillg A are all differe11l.
Simple Complex Roots
If complex roots occur, they must occur in conjugate pairs since the c~efficients of (1)
are real. Thus, if A = 'Y + iw is a simple root of (2), so is the conjugate A = 'Y - iw, and
two corresponding linearly independent solutions are (as in Sec. 2.2, except for notation)
YI
E X AMP L E 2
=
Y2 = e1'X sin wx.
e1'X cos wx,
Simple Complex Roots. Initial Value Problem
Solve the initial value problem
,"," - y" + 100,", - 100y
=
yeo)
0,
=
y' (0)
4,
=
11,
.1'''(0) = -299.
The characteristic equation is A3 - A2 + 100A - 100 = O. It has the root 1, as can perhaps be
seen by in~peclion. Then division by A-I ~hows that the other roots are:+: lOi. Hence a general solution and
its derivatives (obtained by differentiation) are
Solutioll.
Y =
cle
x
+ A cos lOx + B sin lOx.
y'
= ('leX -
lOA sin lOx + lOB cos lOx.
y"
= C1ex -
100A cos IOle - 100B sin lOx.
From this and the initial conditions we obtain by setting x = 0
(a)
C1
+A
= 4.
(b)
c]
+
lOB = 11,
(c)
c1 -
100A = -299.
We solve this system for the unknowns A. B. ('1' Equation (a) minus Equation (c) gives lOlA
Then ('1 = I [mm (a) and B = I from lb). The solution is (Fig. 72)
y =
eX
+ 3 cos
This gi\e~ the ~olution curve. which o_cillate, ahout
eX
lOx
= 303, A = 3.
+ sin lOx.
(dashed in Fig. 72 on p. ll-l).
•
CHAP. 3
114
Higher Order Linear ODEs
y
20
10
4
°0~--~----~2----~3--X
Fig. 72.
Solution in Example 2
Multiple Real Roots
If a real double root occurs, say, Al = A2 , then)'1 = )'2 in (3), and we take)'1 and .\)'1 as
corresponding linearly independent solutions. This is as in Sec. 2.2.
More generally, if A is a real root of order 111, then 111 corresponding linearly independent
solutions are
(7)
We derive these solutions after the next example and indicate how to prove their linear
independence.
E X AMP L E 3
Real Double and Triple Roots
Solve the ODE yV - 3iv
Solution.
+ 3y'" - / '
=
O.
The characteristic equation ,,5 -
"3 = 4 = "5 = I, and the answer is
3A4 + 3A3
-
A2
= 0 has the roots
Al
=
A2
= 0 and
•
(8)
Derivation of (7). We write the left side of (\) as
L[yJ =
y<n)
+ an _ 1y<n-D + ... + aoy.
Let y = e A '. Then by performing the differentiations we have
Now let Al be a root of mth order of the polynomial on the right, where In ~ 11. For
< 11 let A""+1> ... , An be the other roots, all different from AI' Writing the polynomial
in product form. we then have
111
with h(A) = I if 111 = 11, and h(A) = (A - A71<+I) ..• (A - An) if III
key idea: We differentiate on both sides with respect to A.
(9)
<
II.
Now comes the
SEC 3.2
Homogeneous linear ODEs with Constant Coefficients
115
The differentiations with respect to x and A are independent and the occurring derivatives
are continuous, so that we can interchange their order on the left:
(10)
The right side of (9) is zero for A = Al because of the factors A - A} (and m
~
2 since
we have a multiple root!). Hence L[x/'X] = 0 by (9) and (10). This proves that X/IX is
a solution of (I).
We can repeat this step and produce x2/"'X, ... , X"'-I/IX by another 111 - 2 such
differentiations with respect to A. Going one step further would no longer give zero on
the right because the lowest power of A - Al would then be (A - A})o, multiplied by
m!h(A) and heAl)
0 because h(lI.) has no factors A - AI; so we get precisely the solutions
in (7).
We finally show that the solutions (7) are linearly independent. For a specific n
this can be seen by calculating their Wronskian, which turns out to be nonzero. For
arbitrary 111 we can pull out the exponential functions from the Wronskian. This gives
(eAx)m = e Amx times a determinant which by "row operations" can be reduced to the
Wronskian of 1. x . ... , X",-l. The latter is constant and different from zero (equal to
1 !2! ... (111 - I)!). These functions are solutions of the ODE /mJ = 0, so that linear
independence follows from Theroem 3 in Sec. 3.1.
•
*
Multiple Complex Roots
In this case, real solutions are obtained as for complex simple-.!oots above. Consequently,
if A = 'Y + iw is a complex double root, so is the conjugate A = 'Y - iw. Corresponding
linearly independent solutions are
(11)
e-yX cos wx,
e YX sin wx,
xeYX cos wx,
xeYX sin wx.
The fi~t two of these result from eAX and e Ax as before, and the second two from xe AT
and xe AX in the same fashion. Obviously, the corresponding general solution is
(12)
For complex triple roots (which hardly ever occur in applications), one would obtain
two more solutions x 2e AX cos wx, x 2e Yx sin wx, and so on.
11-61
ODE FOR GIVEN BASIS
Find an ODE (1) for which the given functions fonn a basis
of solutions.
1. eX, e 2X , e 3x
3. e:c , e- x • cos x. sin x
17-121
GENERAL SOLUTION
Solve the given ODE. (Show the details of your work.)
7. y'" + y' = 0
8. yiv - 29y" + LOOy = 0
4. cos x, sin x, x cos x, x sin x
9. y'" + y" - y' - y = 0
10. 16yiV - 8y" + Y = 0
5. I, x, cos 2x. sin 2x
6. e- 2x , e- x , eX, e 2x , I
11. ylll - 3)''' - 4y' + 6y
12. yiv + 3y" - 4y = 0
=
0
CHAP. 3
116
L13-18!
Higher-Order Linear ODEs
INITIAL VALUE PROBLEMS
Solve by a CAS, giving a general solution and the particular
solution and its graph.
13. lV + 0.45.\"'" - 0.165y" + 0.0045y' - 0.00175y = 0,
)"(0) = 17.4, y' (0) = -2.82. y"(O) = 2.0485.
y'''(0) = -1.458675
14. 4,,'"
+
8,,"
+
41 v'
+
37v = 0, 1'(0) = 9,
y"(O) = ~6.5, /i(O) = ....:39.75 .
+ 3.2y" + 4.81/ = 0, ,,(0) = 3.4,
/(0) = -4.6.y"(O) = 9.91
15. y'"
16. yiv + 4y = o. .\'(0) =
-,,"'(0) = -~
!, y' (0)
=
-!. y" (0)
= ~,
17. 1'iv - 91''' - 400" = O. ,,(0) = O. /(0) = O.
~"(O) ~ 4\. y'''(0) = 0 .
.
18.
y''' + 7.5.\"" + 14.25/ - 9.125-"
.\"(0)
=
=
0,
10.05, y' (OJ = -54.975,
y"(O) = 257.5125
11
program for calculating WronsJ..ians.
(b) Apply the program to some bases of third-order
and fourth-order constant-coefficient ODEs. Compare
3.3
Extend the solution method in Sec. 2.5 to any order
Solve X3y '" + 2x2 y" - 4xy' + 4y = 0 and another
ODE of your choice. In each case calculate the
Wronskian.
20. PROJECT. Reduction of Order. This is of practical
interest since a single solution of an ODE can often be
guessed. For second order. see Example 7 in Sec. 2.1.
(C)
11.
(a) How could you reduce the order of a linear
constant-coefficient ODE if a solution is known?
(b) Extend the method to a variable-coefficient ODE
.1''''
+ P2(xly" + PI(X)Y' + Po(x)y
=
o.
Assuming a solution YI to be known, show that another
solution is Y2(X) = U(X)YI(X) with u(x) = J z(x) dx and
.:: obtained by solving
)"1'::"
19. CAS PROJECT. Wronskians. Euler-Cauchy
Equations of Higher Order. Although Euler-Cauchy
equations have mriable coefficients (powers of x). we
include them here because they fit quite well into the
present methods.
(a) Write
the results with those obtained by the program most
likely available for Wronskians in your CAS.
+ (3y; + P2YI)'::' + (3y~ +
2P2 Y ;
+ PIYI)'::
=
O.
(e) Reduce
x 3)"'"
-
3x\" + (6 -
X2)X/ -
(6 - X2»)" = O.
using Yl = x (perhaps obtainable by inspection).
21. CAS EXPERIMENT. Reduction of Order. Starting
with a basis, find third-order ODEs with variable
coefficients for which the reduction to second order
turns out to be relatively simple.
Nonhomogeneous Linear ODEs
We now turn from homogeneous to nonhomogeneous linear ODEs of nth order. We write
them in standard form
(1)
/n)
+ Pn_I(X)y<n-D + ... + PI(X)'" + Po(x)y = rex)
with /n) = d'\ldx n as the first term, which is practical, and r(x) 'i= O. As for second-order
ODEs, a general solution of (I) on an open interval I of the x-axis is of the form
(2)
Here Yh(X) = CIYt(X)
homogeneous ODE
(3)
+ ... +
cny,,(x) is a general solution of the corresponding
/n) + Pn_I(X)/n-D + ... + PI(X)y' + Po(x)y = 0
on I. Also, Yp is any sulution of (l) on I containing no arbitrary constants. If (I) has
continuous coefficients and a continuous rex) on J, then a general solution of (1) exists
and includes all solutions. Thus (1) has no singular solutions.
SEC 3.3
Nonhomogeneous Linear ODEs
117
An initial value problem for (I) consists of (l) and
11
initial conditions
(4)
with Xo in I. Under those continuity assumptions it has a unique solution. The ideas of
proof are the same as those for n = 2 in Sec. 2.7.
Method of Undetermined Coefficients
Equation (2) shows that for solving (I) we have to determine a particular solution of (1).
For a constant-coefficient equation
(5)
(ao, ... , an - 1 constant) and special r(x) as in Sec. 2.7, such a )'p(x) can be detennined
by the method of undetermined coefficients. as in Sec. 2.7, using the following rules.
(A) Basic Rule as in Sec. 2.7.
(B) Modification Rule. If a term il1 your choice for )'p(x) is a solution of the
homogeneous equation (3), thel111l111tiply yp(x) by xk, where k is the smallest positive
integer such that no tenn of xkyp(X) is a solution of (3).
(C) Sum Rule as in Sec. 2.7.
The practical application of the method is the same as that in Sec. 2.7. It suffices to
illustrate the typical steps of solving an initial value problem and, in particular, the new
Modification Rule, which includes the old Modification Rule as a particular case (with
k = 1 or 2). We shall see that the technicalities are the same as for 11 = 2. perhaps except
for the more involved detennination of the constants.
E X AMP L E 1
Initial Value Problem. Modification Rule
Solve the initial value problem
(6)
y'"
+ 3y" + 3y' + Y = 30e- x ,
yeo)
= 3,
/(0) =
Step 1. The characteristic equation is A3 + 3A2 + 3'\
A = -I. Hence a general solution of the homogeneous ODE is
Solution.
Step 2. If we try Yp = Ce -x, we get -C + 3C - 3C
The Modification Rule calls for
+
1 = (A
3
)'~ = C(3x 2 - x3)e -x.
).; = C(6x - 6x
y;' = C(6 -
y"(0) =
+
-47.
1)3 = O. It has the triple root
+ C = 30, which has no solution. Try Cxe -x and Cx2e -x.
Yp = Cx e- x .
Then
-3,
I Sx
2
+ x 3 )e- x ,
+ 9x 2
-
x 3 )e -x.
118
CHAP. 3
Higher-Order linear ODEs
Substitlllion of these expressions into (6) and omission of the common factor e -x gives
The linear, quadratic. and cubic terms drop out. and 6C = 30. Hence C = 5. This gives yp = 5x 3 e- x .
Step 3. We now write down y = Jh + yp' the general solution of the given ODE. From it we find C1 by the
first initial condition. We insert the value. ditTerenliate, and determine c2 from the second initial condition. insert
the value, and finally determine ("3 from /'(0) and the third initial condition:
)"(0) = C1 =
/
y"
+ C2 +
= [-3
=
[3
Hence the
+
2c3
(-c2
+
2C3}X
+ (15 -
C3}X
2
2
5x je-
-
+ (30 - 4c3)x + (-30 + (3)X +
allswer 10
3
X
,
5x 3 je- x .
3
/(0)
=
-3
+ ("2 = -3.
/'(0) = 3 + 2c3 = -47.
C3 =
-25.
our problem is (Fig. 73)
The curve of y begins at (0, 3) with a negative slope. as expected from the initial values. and approaches zero
•
as x --'> ce. The da~hed curve in Fig. 73 is ypy
5
0
-5
5
\
10
x
Fig. 73. Y and Yp (dashed) in Example 1
Method of Variation of Parameters
The method of variation of parameters (see Sec. 2.10) also extends to arbitrary order 11.
It gives a particular solution Yp for the nonhomogeneous equation (1) (in standard foml
with y<n) as the first term!) by the formula
Yp(X) =
~
..c..
Yk(X)
k = 1
(7)
= Yl(X)
I
WI (x)
W(x) rex) dx
I
Wk(x)
- - rex) dx
W(x)
+ ... + Yn(X)
I
Wn(x)
W(x) rex) dx
on an open interval I on which the coefficients of (I) and rex) are continuous. [n (7) the
functions .1'1> •••• )'n form a basis of the homogeneous ODE (3), with Wronskian W. and
l1j (j = I, ... , 11) is obtained from W by replacing the jtb culumn of W by the column
[0 0
0 J]T. Thus, when 11 = 2. this becomes identical with (2) in Sec. 2.10,
.r 1
W=
/.
,
"
1
.2
"/, .
)'2
~I = ."1'
The proof of (7) uses an extension of the idea of the proof of (2) in Sec. 2.10 and can
be found in Ref [All] listed in App. I.
SEC. 3.3
Nonhomogeneous Linear ODEs
E X AMP L E 2
119
Variation of Parameters. Nonhomogeneous Euler-Cauchy Equation
Solve the nonhomogeneous Euler-Cauchy equation
(x> 0).
Solution.
Step 1. General solution of the homogeneous ODE. Substitution of)' = xm and the derivatives
into the homogeneous ODE and deletion of the factor xm give~
/11(/11 -
1)(1/1 - 2) - 3m(m - I) + 6111 - 6 = O.
The roots are L 2, 3 and give as a basis
x,
)'1 =
Hence the corresponding general solution of the homogeneous ODE is
)'h
Step 2. Determinants needed in (7). These
=
Cl x
+
C2 x
2
+
C3~
3
fiT
x
2x
3x2 = 2x 3
0
2
6x
0
x2
x
0
2,
3x
2
6x
W=
Wj
3
x2
x
x
W2 =
3
0
x3
0
3x 2
0
2
=x4
-2x 3
6x
2
0
2x
0
x
x
W3 =
0
= x2.
2
Step 3. Integration. In (7) we also need the right side rex) of our ODE in standard fonn. obtained by division
of the given equation by the coefficient x 3 of /"; thus, rex) = (x 4 In x)/.1' 3 = x In .1'. In (7) we have the simple
quotients W 1/W = x/2, W2 /W = -I, W3/W = 11(2,). Hence (7) becomes
Yp = x
I2
x x In x dx - x
( 3In
~ ~
x -
Simplification gives yp = ~x4 (in x
2I
3)
~
x In x dx + x
- x
2
(2~
3I
In x -
1
2x x In x dt
X2)
4
+
.1'3
2
(x In
X -
x).
-11'). Hence the answer is
.
Figure 74 shows Yv Can you explain the shape of this curve? Its behavior near x = O? The occurrence of
a minimum? Its rapid increase? Why would the method of undetermmed coefficients not have given the
~~~
120
CHAP. 3
Higher-Order Linear ODEs
y
30
20
10
0
-10
-20
x
Particular solution Yp of the nonhomogeneous
Euler-Cauchy equation in Example 2
Fig. 74.
Application:
10
J
-------"-
Elastic Beams
Whereas second-order ODEs have various applications, some of the more important ones
we have seen, higher order ODEs occur much more rarely in engineering work. An
important fourth-order ODE governs the bending of elastic beams, such as wooden or iron
girders in a building or a bridge.
Vibrations of beams will be considered in Sec. 12.3.
E X AMP L E 3
Bending of an Elastic Beam under a Load
We consider a beam B of length L and constant (e.g .. rectangular) cross section and homogeneous elastic
material (e.g .. ~teel): see Fig. 75. We assume that under its own weight the beam is bent so little that it is
practically straight. If we apply a load to B in a vertical plane through the axis of symmetry (the x-axis in
Fig. 75). B is bent. Its axis is curved into the so-called elastic curve C (or deflection curw). It is shown in
elasticity theory that the bending moment M(x) is proportional to the curvarure k(x) of C. We assume the bending
to be small, ~o that the deflection )"(x) and its derivative y' (X) (determining the tangent direction of C) are small.
Then. by calculus. k = y"I(1 + /2)312 = /'. Hence
M(x) = Ely"(x).
El is the constant of proportionality. E is Young's lIlodulus of elasticity of the material of the beam. 1 is the
moment of inertia of the cross section about the (horizontal) ~-axis in Fig. 75.
Elasticity theory shows further that M"(x) = f(x). where f(x) is the load per unit length. Together,
Elyiv = f(x).
(8)
~--;
--L
y
Undeformed beam
Z
Z
Fig. 75.
Deformed beam
under uniform load
(simply supported)
Elastic Beam
SEC. 3.3
Nonhomogeneous linear ODEs
121
The practically most important supports and corresponding boundary conditions are as follows (see Fig. 76).
(Al
Simply supported
y
= y" = 0 at x = 0 and L
(B)
Clamped at both ends
y
=
(C)
Clamped at x = 0, free at x = L
y'
= 0 at x = 0 and L
.1'(0) = y' (0) = 0, y"(L) = y"'(L) =
o.
The boundary condition y = 0 means no displacement at that point, y' = 0 means a horizontal tangent, v" = 0
means no bending moment. and y'" = 0 means no shear force.
Let us apply this to the uniformly loaded simply supported beam in Fig. 75. The load is i(x) "" io = const.
Then (8) is
k
(9)
io .
=
EI
This can be solved simply by calculus. Two integrations give
y"(0)
= 0 gives
c2
= O. Then
y"(L)
= L(~kL +
Y"
cl)
=
= 0,
=
Cl
-kLl2 (since L
'* 0). Hence
k 2 - Lx).
"2(X
Integrating this twice. we obtain
with
C4
= 0 from yeO) = O. Then
3
yeLl =
kL
2
(L12
_
L3
6
+ c )
= 0,
3
Inserting the expression for k, we obtain as our solution
Y =
io
24EI (x
4
- 2Lx
3
3
+ Lx).
Since the boundary conditions at both ends are the same. we expect the deflection y(x) to be "symmetric" with
respect to L12, that is, y(x) = y(L - x). Verify this directly or set.r = u + L12 and show that y becomes an
even function of u,
From this we can see that the maximum deflection in the middle at II
that the positive direction points downward.
= 0 (x =
CA) Simply supported
x=O
x=L
(B) Clamped at both
ends
x=L
x=0
x =L
Fig. 76.
(el Clamped at the left
end, free at the
right end
Supports of a Beam
4
L12) is 5i o L /(16 . 24EI). Recall
•
122
CHAP. 3
Higher-Order Linear ODEs
. . 11 HIE -M-=:3 E T 1-;:-3=______
11.:-iil
GENERAL SOLUTION
Solve the following ODEs. (Show the details of your work.)
1. y'" - 2y" - 4/
+
8y = e- 3 ,'
+
8\"2
+ 3y" - 5/ - 39y = 30 cos x
+ 0.5y" + 0.0625y = e- x cos 0.5x
4. ,,'" + 2,," - 5y' - 6y = 100e- 3x + 18e- x
5. x 3y'" + 0.75x),· - 0.75.\" = 9X 5 . 5
2. y'"
3.
yiv
6. (x0 3 + 4D2)y = 8e x
7. (D 4
+
IOD 2
+
9/)y = 13 cosh 2x
+ 18/))' = e 2x
8. (0 3 - 2D2 - 9D
19-141
INITIAL VALUE PROBLEMS
Solve the following initial value problems. (Show the
details.)
9. rIll - 9\"" + 27/ - 27\" = 54 sin 3x.
:\"' (0) ~ U.S, . y" (0) ;", 38.5
yeO)
= 3.5,
= 128 cosh 2x, yeO) = 1, ),'(0) = 24.
/"(0) = -HiO
11. (x 3 D3 - x 2 D2 - 7xD + 16/)y = 9x In x.
yO) = 6. Oy(!) = 18, D2y(l) = 65
12. (0 4 - 26D2 + 25/)y = 50(x + 1)2, yeO) = 12.16,
Dy(O) = -6. D2y(0) = 34. D 3 \"lO) = -130
10.
)'iv -
16y
y"(O)
=
20,
=
•
1. What is the superposition or linearity principle? For
what 11th-order ODEs does it hold?
2. List some other basic theorems that extend from
second-order to 11th-order ODEs.
3. If you know a general solution of a homogeneous linear
ODE. what do you need to obtain from it a general
solution of a corresponding nonhomogeneous linear
ODE?
4. What is an initial value problem for an 11th-order linear
ODE?
5. What is the Wronskian? What is it used for?
16-151
GENERAL SOLUTION
Solve the given ODE. (Show the details of your work.)
6. ylll + 6y" + 18y' + 40y = 0
7. 4x 2-,,'" + 12x-,," + 3-,,' = 0
8. yiv + lOy" + 9y = 0
9. 8y'" + 12y" - 2)"
-
3)'
=
0
10. (D 3 + 3D 2 + 3D + I)' = x 2
13. (D 3 + 40 2 + 850)y = 135xe x , yeO) = 10.4.
Dy(O) = -18.1, D2y(0) = -691.6
14. (2D 3 - 0 2 - 8D + 4/)y = sin x. yfO) = I,
Dy(O) = O. D2yfO) = 0
15. WRITING PROJECT. Comparison of Methods.
Write a report on the method of undetermined coefficients
and the method of variation of parameters. discussing and
comparing the advantages and disadvantages of each
method. Illustrate your findings with typical examples.
Try to show that the method of undetermined coefficients.
say. for a third-order ODE with constant coefficients and
an exponential function on the right, can be derived from
the method of vmlation of parameters.
16. CAS EXPERIMENT. Undetermined Coefficients.
Since variation of parameters is generally complicated,
it seems worthwhile to try to extend the other method.
Find out experimentally for what ODEs this is possible
and for what not. Hint: Work backward. solving ODEs
with a CAS and then looking whether the solution
could be obtained by undetermined coefficients. For
example. consider
y'" 3
x y'"
12,r" + 48/ - 64y = x l12 e 4x
+x
2
-,," -
6xy'
+
and
6,v = x In x.
TIONS AND PROBLEMS
11. (xD 4
12. (D
4
= 150x 4
3
2D - SD2)y = 16 cos 2x
l)y = ge xl2
+ 03)y
-
13. lD3 +
14. (x 3D3 - 3x 202 + 6xD - 61)y = 30x- 2
15. (D 3 - D2 - D + /)' = eX
116-201
INITIAL VALUE PROBLEMS
Solve the given problem. (Show the details.)
16. y'" - 2-,," + 4/ - 8y = O. yeO) = -I,
y' (0) = 30. y" (0) = 28
17. x 3,.,,'" + 7x 2"',," -I f2xv'
- 10"
= O.
yO)
l.
·
Y (I) = - 7 • ." (l) = 44
18. (D 3 + 25D)y = 32 cos 2 4x, yeO) = 0,
Dy(O) = 0, D2y(0) = 0
19. (D4 + 40D 2 - 441I)y = 8 cosh x. yeO) = 1.98,
Oy(O) = 3, 02y (0) = -40.02. D 3 y(0) = 27
20. (x 3D3 + 5x 2D2 + 2xD - 2/)y = 7x 3/2 ,
y(l) = 10.6,
Dy(l) = -3.6,
D2y(l)
=
31.2
Summary of Chapter 3
123
Higher Order Linear ODEs
11 = 2).
Chapter 3 extends Chap. 2 from order 11 = 2 to arbitrary order 11. An nth-order
linear ODE is an ODE that can be written
Compare with the similar Summary of Chap. 2 (the case
(1)
In)
+ Pn_1(X)/n-1)
+ ... +
P1(X)/
+ Po(x)y = r(x)
with y(n) = dny/dxn as the first term; we again call this the standard form. Equation
(I) is called homogeneous if r(x) == 0 on a given open interval 1 considered,
nonhomogeneous if r(x) =1= 0 on 1. For the homogeneous ODE
/n)
(2)
+ Pn_1(X)/n-ll + ... -, P1(X)y' + Po(.x)y = 0
the superposition principle (Sec. 3.1) holds, just as in the case 11 = 2. A basis or
fundamental system of solutions of (2) on I consists of 11 linearly independent
solutions Yi, ... ,.\"n of(2) on I. A general solution of(~) on lis a linear combination
of these,
(3)
r=cr
1. 1 +"'+cr
n. n
(Cb . . . , C n
.
arbitrary constants).
A general solution of the nonhomogeneous ODE (1) on I is of the form
(4)
Y
=
Yh
+ Yp
(Sec. 3.3).
Here, Yp is a particular solution of (1) and is obtained by two methods
(undetermined coefficients or variation of parameters) explained in Sec. 3.3.
An initial value problem for (I) or (2) consists of one of these ODEs and 11
initial conditions (Secs. 3.1, 3.3)
(5)
with given Xo in I and given Ko, .... K Il - l . If Po • .... Pn-1o r are continuous on
I. then general solutions of (I) and (2) on J exist. and initial value problems (I).
(5) or (2). (5) have a unique solution.
••••
ti
"\
~/
CHAPTER
~
./
4
~1
,
V"
"
...
.:. ;-.~
~
...
.. ~
+-.
~-
..
...:c..
Systems of ODEs. Phase Plane.
Qualitative Methods
Systems of ODEs have various applications (see, for instance, Secs. 4.1 and 4.5). Their
theory is outlined in Sec. 4.2 and includes that of a single ODE. The practically important
conversion of a single nth-order ODE to a system is shown in Sec. 4.1.
Linear systems (Secs. 4.3, 4.4, 4.6) are best treated by the use of vectors and matrices,
of which, however, only a few elementary facts will be needed here, as given in Sec. 4.0
and probably familiar to most students.
Qualitative methods. In addition to actually solving systems (Sec. 4.3, 4.6), which is
often difficult or even impossible, we shall explain a totally different method, namely, the
powerful method of investigating the general behavior of Whole families of solutions in
the phase plane (Sec. 4.3). This approach to systems of ODEs is called a qualitative
method because it does not need actual solutions (in contrats to a "quantitative method"
of actually solving a system).
This phase plane method, as it is called, also gives information on stability of ~olutions.
which is of general importance in control theory, circuit theory, population dynamics, and
so on. Here, stability of a physical system means that, roughly speaking, a small change
at some instant causes only small changes in the behavior of the system at all later times.
Phase plane methods can be extended to nonlinear systems, for which they are
particularly useful. We will show this in Sec. 4.5, which includes a discussion of the
pendulum equation and the Lotka-Volterra population model. We finally discuss
nonhomogeneous linear systems in Sec. 4.6.
NOTATION. Analogous to Chaps. 1-3, we continue to denote unknown functions by
x for functions, Xl (t), X2(t),
as is sometimes done in systems of ODEs.
y; thus, YI (I), h(!)· This seems preferable to suddenly using
Prerequisite: Chap. 2.
References and Ansll'ers
4.0
10
Problems: App. 1 Part A. and App. 2.
Basics of Matrices and Vectors
In discussing li1/ear systems of ODEs we shall use matrices and vectors. This simplifies
formulas and clarifies ideas. But we shall need only a few elementary facts (by no means
the bulk of material in Chaps. 7 and 8). These facts will very likely be at the disposal of
most students. Hence this sectio1l is for reference only. Begin with Sec. 4.1 and consult
4.0 as needed.
124
SEC 4.0
Basics of Matrices and Vectors
125
Most of our linear systems will consist of two ODEs in two unknown functions .' let).
)'2(t),
for example,
(1)
(perhaps with additional given functions .Rl(t), g2(t) in the two ODEs on the right),
Similarly, a linear system of n first-order ODEs in n unknown functions YI(t).
)'n(t) is of the fonn
(2)
(perhaps with an additional given function in each ODE on the right).
Some Definitions and Terms
Matrices.
In (I) the (constant or variable) coefficients form a 2 x 2 matrix A, that is.
an anay
A= [-5 2] .
for example,
13
~
Similarly, the coefficients in (2) form an n x n matrix
(4)
The (lIb (/12' . . . are called entries, the horizontal lines rows, and the vertical lines
columns. Thus, in (3) the first row is [al1 a12]' the second row is [a2l a22], and the
first and second columns are
and
a
12
]
.
[ a22
In the "double subscript notation" for entries, the first subscript denotes the row and the
second the column in which the entry stands. Similarly in (4). The main diagonal is the
diagonal an a22
ann
in (4), hence all {/22 in (3).
We shall need only square matrices, that is, matrices with the same number of rows
and columns, as in (3) and (4).
126
CHAP. 4
Vectors.
Systems of ODEs. Phase plane. Qualitative Methods
A column vector x with n components Xl,
x=
. . . , Xn
is of the form
thus if 11 = 2.
Similarly. a row vector v is of the form
thus if 11
= 1. then
Calculations with Matrices and Vectors
Equality. Two 11 X
Thus for 1l = 2. let
A=
11
matrices are equal if and only if corresponding entries are equal.
a12 ]
[ all
a21
and
[b
B=
ll
b 21
a22
12
b ]
b 22
Then A = B if and only if
all
= bu.
a12
= bI2
a21
=
b2I ,
a22
= b 22 ·
Two column vectors (or two row vectors) are equal if and only if they both have
components and corresponding components are equal. Thus. let
Then
v
=x
11
if and only if
Addition is performed by adding corresponding entries (or components); here, matrices
must both be II X 11, and vectors must both have the same number of components. Thus
for n = 2,
(5)
Scalar multiplication (multiplication by a number c) is performed by multiplying each
entry (or component) by c. For example, if
[-: :].
A=
then
-7A =
[-63
14
If
v
~
[04].
-13
then
IOv =
[-:30}
-2~J
.
SEC. 4.0
Basics of Matrices and Vectors
127
Matrix Multiplication_ The product C = AB (in this order) of two n X n matrices
A = [ajk] and B = [bjk ] is the n X n matrix C = [Cjk] with entries
n
(6)
Cjk
j
2:
=
aj'mb'mk
= 1, ... , n
k = 1, ... , n,
'm~1
that is, multiply each entry in the jth row of A by the corresponding entry in the kth column
of B and then add these n products. One says briefly that this is a "multiplication of rows
into columns." For example,
3J[I-4J
0
2
5
9
[ -2
[9-1+3-2
-2 - 1 + 0 - 2
~ [~:
(-2)-(-4)
I
-4J [ 9
2
5
- 2
3J
0
=1=
BA in general. In our
1 -3
[1-9 + (-4)-(-2)
=
+ 0-5
-2:J
Matrix multiplication is not commutative, AB
CAUTION!
example,
[
9-(-4) + 3-5J
+ (-4) - OJ
2-3+5-0
2 - 9 + 5 - (- 2)
Multiplication of an n X n matrix A by a vector x with n components is defined by the
same rule: v = Ax is the vector with the n components
n
Vj
2:
=
j = I, - - -, n.
ajmx'm
'm~1
For example,
Systems of ODEs as Vector Equations
Differentiation_ The derivative of a matrix (or vector) with variable entries (or
components) is obtained by differentiating each entry (or component). Thus, if
yet) =
Yl(t)]
[ Y2(t)
=
[e-2t]
,
I
Y (t) =
then
sin t
.Vl(t)]
1
[-2e-2t]
=
.
[ Y~(t)
cos t
Using matrix multiplication and differentiation, we can now write (1) as
I
(7)
y
=
[ ']
Yl
I
Y2
=
Ay =
[
all
a 21
12
a ]
a22
[VI]
.
Y2
,
e.g.,
y' =
[-5
13
128
CHAP. 4
Systems of ODEs. Phase Plane. Qualitative Methods
Similarly for (2) by means of an n X n matrix A and a column vector y with II components,
namely, y' = Ay. The vector equation (7) is equivalent to two equations for the
components. and these are precisely the two ODEs in (1).
Some Further Operations and Terms
Transposition is the operation of writing columns as rows and conversely and is indicated
by T. Thus the transpose AT of the 2 X 2 matrix
a
12
a22
J [-5 2J!
-
is
13
The transpose of a column vector, say,
is a row vector,
and conversely.
Inverse of a Matrix. The n X n unit matrix I is the 11 X 11 matrix with main diagonal
1, 1, ... ,land all other entries zero. If for a given n X n matrix A there is an n X 11
matrix B such that AB = BA = I, then A is called nonsingular and B is called the inverse
of A and is denoted by A -1; thus
(8)
If A has no inverse, it is called singular. For
(9)
11
= 2,
A-I =
det A
where the determinant of A is
(10)
detA =
a 11
l
1I2I
(For generaln, see Sec. 7.7, but this will not be needed in this chapteL)
Linear Independence. r given vectors v(1), ... , VCT) with n components are called a
linearly independent set or, more briefly, linearly independent, if
(11)
C1 VCl)
+ ... + c,.vCr) = 0
implies that all scalars c 1 , • • . , c,. must be zero; here, 0 denotes the zero vector, whose
n components are all zero. If (II) also holds for scalars not all zero (so that at least
one of these scalars is not zero), then these vectors are called a linearly depel1dent set
or, brietly, linearly dependent, because then at least one of them can be expressed as
SEC. 4.0
129
Basics of Matrices and Vectors
a linear combination of the others; that is, if, for instance,
can obtain
v(1)
= - ~
(C2 VC2 )
C1
* 0 in (ll), then we
+ ... + crvc'·)).
C]
Eigenvalues, Eigenvectors
Eigenvalues and eigenvectors will be very important in this chapter (and, as a matter of
fact, throughout mathematics).
Let A = [Ujk] be an n X n matrix. Consider the equation
Ax = AX
(12)
where A is a scalar (a real or complex number) to be determined and x is a vector to be
determined. Now for every A a solution is x = O. A scalar A such that (12) holds for some
vector x
0 is called an eigenvalue of A, and this vector is called an eigenvector of A
corresponding to this eigenvalue A.
We can write (12) as Ax - AX = 0 or
*
(A - AI)X
(13)
=
O.
These are n linear algebraic equations in the n unknowns Xl> ••• , Xn (the components of
x). For these equations to have a solution x
0, the determinant of the coefficient matrix
A - AI must be zero. This is proved as a basic fact in linear algebra (Theorem 4 in
Sec. 7.7). In this chapter we need this only for n = 2. Then (13) is
*
(14)
in components,
=0
(14*)
Now A - AI is singular if and only if its determinantdet (A - AI), called the characteristic
determinant of A (also for general n), is zero. This gives
det (A _ AI)
=
I
Uu -
U21
A
I
U12
U22 -
A
(15)
This quadratic equation in A is called the characteristic equation of A. Its solutions are
the eigenvalues Al and A2 of A. First determine these. Then use (14*) with A = Al to
determine an eigenvector xCi) of A cOlTesponding to A1' FinaIly use (14*) with A = A2 to
find an eigenvector X(2) of A cOlTesponding to A2' Note that if x is an eigenvector of A,
so is h for any k
O.
*
CHAP. 4
130
E X AMP L E 1
Systems of ODEs. Phase plane. Qualitative Methods
Eigenvalue Problem
Find the eigenvalues and eigenvectors of the maliix
-4.0
A= [
(16)
Soluti01l.
4.0J.
-1.6
1.2
The characteristic equation is the quadratic equation
-4 - A
det [A -
All
=
1 -1.6
4
1 = A2
1.2 - A
+ 2.8A +
1.6 = O.
It has the solutions Al = -2 and A2 = -0.8. These are the eigenvalues of A.
Eigenvectors are obtained from (14*). For A = Al = -2 we have from (14*)
(-4.0 + 2.0)x1 +
+
=0
(1.2
+ 2.0)x2 =
O.
A solution of the fir~t equation is Xl = 2, X2 = I. This also satisfies the second equation. (Why?). Hence an
eigenvector of A corresponding to Al = -2.0 is
(17)
X(2)=[I]
Similarly.
0.8
is an eigenvector of A corresponding to A2
=
-0.8. as obtained from (14") with A = A2. Verify this.
•
4.1 Systems of ODEs as Models
We first illustrate with a few typical examples that systems of ODEs can serve as models
in various applications. We further show that a higher order ODE (with the highest
derivative standing alone on one side) can be reduced to a first-order system. Both facts
account for the practical importance of these systems.
E X AMP L E 1
Mixing Problem Involving Two Tanks
A mixing problem involving a single tank is modeled by a single ODE. and you may first review the
corresponding Example 3 in Sec. 1.3 because the principle of modeling will be the same for two taoks. The
model will be a system of two first-order ODEs.
Taok T1 aod T2 in Fig. 77 contain initially 100 gal of water each. In T1 the water is pure, whereas 150 I b of
fertilizer are dissolved io T2. By circulating liquid at a rate of 2 gal/min and stirring (to keep the mixture uniform)
the amounts of fertilizer ,vI(t) III T1 and Y2(t) in T2 change with time t. How long should we let the liqUid circulate
so that T1 will contain at least half as much fertilizer as there will be left in T2?
Setting up the model. As for a single tank. the time rate of change \.~ (t)
inflow minus outflow. Similarly for tank T 2 . From Fig. 77 we see that
Soluti01l. Step I.
y~ = Inflow/min - Outflow/min =
2
100.1'2 -
2
}'~ = Inflow/min - Outflow/min = 100
Of)'l (t)
equal&
2
IOOY1
2
\'1 -
100 Y2
Hence the mathematical model of our mixture problem i, the system of first-order ODEs
)'~ = -0.02Y1
+ 0.02)'2
(Tank T1 )
(Tank T2 ).
SEC. 4.1
131
Systems of ODEs as Models
yet)
150
-
r
2 gal/min
-
2 gal/min
T]
'--
100
----T2
System of tanks
Fig. 77.
Fertilizer content in Tanks T, (lower curve) and T2
As a vector equation with column vector y =
[YIJ and matrix A this becomes
Y2
-0.02
y' = Ay,
0.02J.
A= [
0.02 -0.02
where
Step 2. General solution. As for a single equation, we try an exponential function of t,
Then
(I)
Dividing the last equation AxeAt = Axe).' by eAt and interchanging the left and right sides, we obtam
Ax = Ax.
We need nontIivial solutions (solutions that are not identically zero). Hence we have to look for eigenvalues
and eigenvectors of A. The eigenvalues are the solutions of the characteristic equation
(2)
1
I=
-0.02 - A
0.02
0.02
-O.oz - A
det (A - AI) =
(-O.oz - A)2 - 0.022 = A(A + 0.04) = O.
We see that Al = 0 (which can very well happen--don't get mixed up-it is eigenvectors that must not be zero)
and A2 = -0.04. Eigenvectors are obtained from (14*) in Sec. 4.0 with A = 0 and A = 0.04. For our present
A this gives [we need only the fIrst equation in (14*)J
-0.02'1 + 0.02'2 = 0
(-0.02 + 0.04)Xl + 0.02.\2
and
=
0,
respectively. Hence Xl = X2 and Xl = -x2, respectively, and we can take Xl = x2 = 1 and Xl
This gives two eigenvectors corresponding to Al = 0 and A2 = -0.04, respectively, namely,
=
-x2 =
1.
and
From (I) and the superposition principle (which continues to hold for systems of homogeneous linear ODEs)
we thus obtain a solution
(3)
where
Cl
and
C2
are arbitrary constants. Later we shall call this a general solution.
Step 3. Use of initial conditions. The initial conditions are yt(O) = 0 (no fertilizer in tank T 1) and Y2(0) = 150.
From this and (3) with t = 0 we obtain
[:: : ::J
132
CHAP. 4
Systems of ODEs. phase plane. Qualitative Methods
In components this is
In
cl
+
c2
= 0, CI
-
c2
= I SO. The solution is Cl = 7S. ("2 = -7S. This gives the answer
component~,
)'1
= 75 - 7Se- O.04t
(Tank T I , lower curve)
+ 75e- o.04t
(Tank T2 • upper curve>.
)'2 =
75
Figure 77 shows the exponential increase of \'1 and the exponential decrease of .\'2 to the common limit 75 lb.
Did you expect this for physical reasons? Can you physically explain why the curves look "symmetric"? Would
the limit change if TI initially contained 100 Ib of fertilizer and T2 contained 50 Ib?
Step 4. Answer. T1 contains half the fertilizer amount of T2 if it contains 113 of the total amount, that is,
SO lb. Thus
YI
= 75 - 75e -O,Mt = SO,
e
-O.Mt
1
t
= 3'
= (In 3)/0.04 = 27,5,
•
Hence the fluid should circulate for at lea,t about half an hour.
EXAMPLE 2
Electrical Network
Find the CUlTents ft (t) and 12 (1) in the network in Fig. 78. Assume all
the instant when the switch is closed.
L = 1 henry
current~
and charges to be zero at t
=
0,
C = 0.25 farad
SWitchlrI;
t=O
E = 12 volts-=-
R2 = 6 ohms
Fig. 78.
Electrical network in Example 2
Solution.
Step I. Setting up the mathematical model. The model of this network is obtained from
Kirchhoff's voiLage law, as in Sec, 2.9 (where we considered single circuits). Let II(t) and 12(t) be the CUiTents
in the left and right loops, respectively, In the left loop the voltage drops are Ll~ = I; [V] over the inductor
and RI(ft - 12) = 4(h - 12 ) [V] over the resistor, the difference because 11 and 12 flow through the resistor
in opposite directions. By Kirchhoff's voltage law the sum of these drops equals the voltage of the battery; that
is, I; + 4(/1 - 12 ) = 12. hence
(4a)
I;
= -4ft
+ 4/2 + 12.
In the right loop the voltage drops are R2/2 = 612 [V] and R 1 (l2 - 'I)
(I/e)f 12 dt = 4 f 12 dt [V] over the capacitor. and their sum is zero.
or
=
4(12 - 11) [V] over the resistors and
1012 - 4ft
+4
f '2
dt = O.
Division by 10 and differentiation gives I~ - O.4/~ + 0.4/2 = O.
To simplify the solution process. we first get rid of 0.4/~, which by (4a) equals 0.4(-4/1
Substitution into the present ODE gives
I~ = OAli - 0.4/2 = OA( -4ft
+
4/2
+
12) - 0.4/2
+
4/2
+
12).
SEC. 4.1
131
Systems of ODEs as Models
and by simplification
I~ = -1.6/ 1
(4b)
+
1.212
+ 4.8.
In matrix form. (4) is (we write J since I is the unit matrix)
J'
(5)
= AJ
+ g,
A=
where
-4.0 4.0J,
[-1.6 l.2
g =
[12.0J .
4.8
Step 2. Solving (5). Because of the vector g thi~ is a nonhomogeneous system. and we try to proceed as for
a single ODE. solving lim the homogeneous system J' = AJ (thu~.J' - A.J = 0) by substituting J = xeAl.
This gives
hence
Ax
=
Ax.
Hence to obtain a nontrivial solution, we again need the eigenvalues and eigenvectors. For the present matrix
A [hey are derived in Example I in Sec.
4.0:
X(2)
=
[
I ]
0.8
Hence a "general solution" of [he homogeneous system is
For a particular
~olution
of the nonhomogeneous
sy~tem
(5). since g is constant. we try a constant column vector
J p = a with components "I' 112' Then J~ = 0, and sub~titution into (5) gives Aa + g = 0; in components.
+ 4.0112 +
12.0 = 0
-1.6aI + 1.2l12 +
4.8 = O.
-4.0111
The solution is
al
= 3, {l2 = 0: thus a =
[~J
.
Hence
(6)
in components.
The initial conditions give
('2+
Hence
('1
=
-4 and
('2
3 =0
= 5. As the solution of our problem we thus obtain
(7)
In components (Fig. 79b),
11 =
-8e- 2t + 5e- o.8t + 3
12 = _4e- 2t + 4e- O.8t .
Now collles an important idea, un which we ~hall elaborate further. beginning in Sec. 4.3. Figure 79a shows
and 12 (t) as two separate curves. Figure 79b shows these two currents as a single curve [ft(l), 12 (t)] in the
1I/2-plane. Thi~ is a parametric representation with time t as the parameter. It is often important to know in
which sense such a curve is traced. This can be indicated by an arrow in the sense of increasing t. as is shown.
The 1I/2-plane is called the phase plane of our system (5), and the curve in Fig. 79b is called a trajectory. We
shall see [hat such "phase plane representations" are far more important than graphs as in Fig. 79a because
they will give a much better qualitative overall impression of the general behavior of whole familie~ of solutions,
•
not merely of one solution as in the present case.
II(t)
CHAP. 4
134
Systems of ODEs. Phase plane. Qualitative Methods
1(t)
i'--~
4
3
~--------------===
2
0.5
/
OL-~L-
o
OL-__L -_ _L -_ _L -_ _~_ _~_ __ _
o
2
3
4
5
_ _L -_ _L -_ _ _ _~~_
2
3
4
5
(a) Currents 11
(b)
(upper curve)
and 12
Fig. 79.
Trajectory 1I1(t), 12(t)]T
in the 1/2 -plane
(the "phase plane")
Currents in Example 2
Conversion of an nth-Order ODE to a System
We show that an nth-order ODE ofthe general form (8) (see Theorem 1) can be converted
to a system of n first-order ODEs. This is practically and theoretically important-practically because it permits the study and solution of single ODEs by methods for
systems. and theoretically because it opens a way of including the theory of higher order
ODEs into that of first-order systems. This conversion is another reason for the importance
of systems, in addition to their use as models in various basic applications. The idea of
the conversion is simple and straightforward, as follows.
THEOREM 1
Conversion of an ODE
An nth-order ODE
y<n)
(8)
=
F(t, y, y', ... , y<n-ll)
call be converted to a system of n first-order ODEs by setting
(9)
Yl
= y,
Y2
= y',
)'3
= y",' .. , Yn =
y<n-ll.
This system is of the form
,
Yl
,
= Y2
)'2 =)'3
(10)
,
Yn-l
y~
PROOF
=
=
Yn
F(t, Y10 Y2, ... , Yn)·
The first n - 1 of these n ODEs follow immediately from (9) by differentiation. Also,
y~ = y<n) by (9), so that the last equation in (10) results from the given ODE (8).
•
SEC. 4.1
135
Systems of ODEs as Models
E X AMP L E 3
Mass on a Spring
To gain confidence in the conversion method, let us apply it to an old friend of ours. modeling the free motions
of a mass on a spring (see Sec. 2.4)
my"
+
+ ky
cy'
0
=
k
-yo
y,
y " = - -c
m
or
III
For this ODE (8) the system (10) is linear and homogeneous,
,
Y1 = .1'2
,
k
c
Y2 = - - Y1 - m
III
Setting y
=
)"2'
[Y1J .we get in matrix form
Y2
O
y'
=
Ay
k
= [_
cll [::J .
_
m
III
The characteristic equation is
-A
det (A - Ali
=
= A2
c
m
k
--- A
111
III
It agrees with that in Sec. 2.4. For an illustrative computation, let
A2
+
2A
+ 0.75
= (A
+ .!:... A + ~
III
+ 0.5)(A +
= O.
111
= I, c = 2, and k = 0.75. Then
1.5)
=
O.
This gives the eigenvalues Al = -0.5 and A2 = -1.5. Eigenvectors follow from the first equation Lll
A - AI = 0, which is -A,y] +.x2 = O. For A] this gives 0.5x1 + x2 = O. say. xl = 2. '\2 = -1. For A2 = -1.5
it gives l.5XI + -'"2 = 0, say, Xl = I, X2 = -1.5. These eigenvectors
X<2l=[
I
-1.5
J
give
e
Y- [2J
-1
-
c1
-0.5t
+ (2. [
1 Je.
-1.5t
-1.5
This vector solution has the first component
which is the expected solution. The second componenl is its derivative
.1'2
----_
=
yi =
y' = _c]e- 0 .5t
-
•
1. 5c2 e -1.5t.
....
-
11-61
MIXING PROBLEMS
1. Find out without calculation whether doubling the flow
rate in Example 1 has the same effect as halfing the
tank sizes. (Give a reason.)
2. What happens in Example 1 if we replace T2 by a tank
containing 500 gal of water and ISO Ib of fertilizer
dissolved in it?
3. Derive the eigenvectors
consulting this book.
III
Example
1 without
4. In Example 1 find a "general solution" for any ratio
a = (flow rate)/(tank si:;;e), tank sizes being equal.
Comment on the result.
5. [f you extend Example I by a tank T3 of the same size
as the others and connected to T2 by two tubes with
CHAP. 4
136
Systems of ODEs. Phase Plane. Qualitative Methods
16. TEAM PROJECT. Two Masses on Springs. (a) Set
up the model for the (undamped) system in Fig. 80.
flow rates a~ between T1 and T2 , what system of ODEs
will you get?
6. Find a "general solution" of the system in Prob. 5.
17-10
I
(b) Solve the ~ystem of ODEs obtained. Him. Try
2
= xe'"' and set w = A. Proceed as in Example I or 2.
y
ELECTRICAL NETWORKS
(e) Describe the influence of initial conditions on the
possible kind of motions.
7. Find the currents in Example 2 if the initial cun-ents
are 0 and - 3 A (minus meaning that 12 (0) flows against
the direction of the an-ow).
8. Find the cun-ents in Example 2 if the resistance of R1
and R2 is doubled (general solution only). First, guess.
9. What are the limits of the CUlTents in Example 27
Explain them in terms of physics.
10. Find the cun-ems in Example 2 if the capacitance is
changed to C = 115.4 F (farad).
111-151
(Net change in
spring length
CONVERSION TO SYSTEMS
=Y2- Y l)
Find a general solution of the given ODE (a) by first
converting it to a system. (b). as given. (Show the detaib
of your work.)
II. y" - 4y = 0
12. y" + 2y' - 24y = 0
13. y" - y' = 0
14. y" + 15y' + SOy = 0
15. 64y" - 48/ - 7."
=
System in
static
equilibrium
0
Fig. 80.
System in
motion
Mechanical system in Team Project 16
4.2 Basic Theory of Systems of ODEs
In this section we discuss some basic concepts and facts about systems of ODEs that are
quite similar to those for single ODEs.
The first-order systems in the last section were special cases ofthe more general system
y~
= flU. Y1' ...• )'n)
)'~
=
f2(t· ."1, .... ),,,)
(1)
We can write the system (I) as a vector equation by introducing the column vectors
Yn]T and f = [fl
fn]T (where T means transposition and saves us
the space that would be needed for writing y and f as columns). This gives
y = [."1
(1)
y' = fU. y).
This system (1) includes almost all cases of practical interest. For Il = I it becomes
)'~ = flU. )'1) or. simply, y' = f(t, y). well known to us from Chap. I.
A solution of (I) on some interval (/ < t < b is a set of n differentiable functions
),,, =
1I,,(t)
SEC. 4.2
137
Basic Theory of Systems of ODEs
on a < t < b that satisfy (1) throughout this interval. In vector form, introducing the
hnr (a column vector!) we can write
"solution I'ector" h = [hI
=
y
h(t).
An initial value problem for (I) consists of (1) and
11
given initial conditions
(2)
in vector form, y(to) = K, where to is a specified value of t in the interval considered and
Knr are given numbers. Sufficient conditions for the
the components of K = [Kl
existence and uniqueness of a solution of an initial value problem (I), (2) are stated in
the following theorem, which extends the theorems in Sec. 1.7 for a single equation. (For
a proof, see Ref. [A 7].)
Existence and Uniqueness Theorem
THEOREM 1
Let f 1, • . . , f n in (1) be continuousfwlctiolls havi1lg collfil1UOUS pm1illl derivatil'es
afl/aYI, ... , afl/aYn' ... , af,/iJYn in some domain R of f)·1.\"2 ••• Yn-space
containing tlte point (to, K I , . • . , K,,). Theil (I) has a solutioll 011 some illfen'al
to - a < t < to + a satisfying (2). and this solution is unique.
Linear Systems
Extending the notion of a linear ODE. we call (1) a linear system if it is linear in
Yl ... , Yn; that is, if it can be written
(3)
In vector form. this becomes
(3)
= Ay + g
y'
where
a~]
.,
y
ann
.,
g
=
[~]. .
gn
= 0, so that it is
y'
*
[h]
Yn
This system is called homogeneous if g
(4)
=
= Ay.
If g
0, then (3) is called nonhomogeneous. The system in Example I in the last section is
homogeneous and in Example 2 nonhomogeneous. The system in Example 3 is homogeneous
CHAP. 4
138
Systems of ODEs. Phase plane. Qualitative Methods
For a linear system (3) we have atl/aYI = an(t), ... , at nlaYn = ann(t) in Theorem l.
Hence for a linear system we simply obtain the following.
THEOREM 2
Existence and Uniqueness in the Linear Case
Let the ajk' sand g/ s in (3) be continuous functions of t on an open interval
a < t < f3 containing the point t = to. Then (3) has a solution y(t} on this inten'al
satisfying (2), and this solution is unique.
As for a single homogeneous linear ODE we have
THEOREM 3
Superposition Principle or Linearity Principle
lfy(1) and y(2) are solutions of the homogeneous linear system (4) on some interval,
so is any linear combination y = c l y(1) + C2y(2).
PROOF
Differentiating and using (4), we obtain
•
The general theory of linear systems of ODEs is quite similar to that of a single linear
ODE in Secs. 2.6 and 2.7. To see this, we explain the most basic concepts and facts. For
proofs we refer to more advanced texts, such as [A7J.
Basis. General Solution. Wronskian
By a basis or a fundamental system of solutions ofthe homogeneous system (4) on some
interval J we mean a linearly independent set of n solutions y(1}, ... , yCn) of (4) on that
interval. (We write J because we need I to denote the unit matrix.) We call a conesponding
linear combination
(5)
(CI, ... ,
Cn
arbitrary)
a general solution of (4) on J. It can be shown that if the ajk(t) in (4) are continuous on
J, then (4) has a basis of solutions on J. hence a general solution. which includes every
solution of (4) on J.
We can write n solutions ym, ... , yCn) of (4) on some interval J as columns of an
11 X 11 matrix
(6)
SEC. 4.3
Constant-Coefficient Systems. Phase plane Method
139
The determinant of Y is called the Wronskian of ym, ... , yen>, written
. I
. I
V(2)
_yCn)
1
v(1)
y~2)
y~n)
" (1)
.n
. n
,,(1)
(7)
W(yCl), ... , yCn»)
=
.2
V Cn )
\' (2)
. n
The columns are these solutions, each in terms of components. These solutions form a
basis on 1 if and only if W is not zero at any 11 in this interval. W either is identically zero
or is nowhere zero in 1. (This is similar to Sees. 2.6 and 3.l.)
If the solutions y(1), . . . , yen) in (5) form a basis (a fundamental system), then (6) is
cnl T ,
often called a fundamental matrix. Introducing a column vector e = [CI C2
we can now write (5) simply as
(8)
y
= Ye.
Furthermore, we can relate (7) to Sec. 2.6, as follows. If y and
second-order homogeneous linear ODE, their Wronskian is
W(y,.:)
=
v
y'
I
To write this ODE as a system, we have to set y = h, Y' = y~ =
(see Sec. 4.1). But then W(y, z) becomes (7), except for notation.
4.3
z are solutions of a
)'2
and similarly for z
Constant-Coefficient Systems.
Phase Plane Method
Continuing, we now assume that our homogeneous linear system
y'
(1)
= Ay
under discussion has constant coefficients, so that the n X n matrix A = [OjkJ has entries
not depending on t. We want to solve (I). Now a single ODE y' = ky has the solution
y = Ce kt • So let us try
(2)
Substitution into (I) gives y'
eigenvalue problem
(3)
Axe>"
= Ay =
Ax
= AX.
AxeAl. Dividing by eAt, we obtain the
140
CHAP. 4
Systems of ODEs. phase Plane. Qualitative Methods
Thus the nontrivial solutions of (1) (solutions that are not zero vectors) are of the form (2),
where A is an eigenvalue of A and x is a con·esponding eigenvector.
We assume that A has a linearly independent set of Il eigenvectors. This holds in most
applications, in particular if A is symmetric (okj = Ojk) or skew-symmetric (okj = -Ojk)
or has Il differellt eigenvalue~.
Let those eigenvectors be XCll, .... x(n) and let them correspond to eigenvalues
AI> ... , An (which may be all different, or some--or even all-may be equal). Then the
corresponding solutions (2) are
(4)
Their Wronskian W = W(yCll. . . . . yen»~ [(7) in Sec. 4.2] is given by
e
Xl
=
(y(1), ... ,
y(n»
=
X2
e
=
\: (n)e Ant
(1) Alt
Xn
e
(n) Ant
X2 e
(1) Alt
W
xil)
(n) Ant
(1) Alt
Xl
e
- n
eAlt
+ ..
+An t
X~l)
XU)
-n
yen)
-'n
On the right, the exponential function is never zero, and the determinant is not zero either
because its columns are the n linearly independent eigenvectors. This proves the following
theorem, whose assumption is true if the matrix A is symmetric or skew-symmetric, or if
the 11 eigenvalues of A are all different.
THEOREM 1
General Solution
If the constant matrix A in the system (I) has a linearly indepelldent set of Il
eigenvectors, then the corresponding solutions y(1), .•. ,y(n) in (4)for711 a basis of
solutiol1s of (l). olld the con'espollding general solution is
(5)
l __________________~
How to Graph Solutions in the Phase Plane
We shall now concentrate on systems (I) with constant coefficients con ~isting of two
ODEs
(6)
y'
=
Ay;
in components,
Of course, we can graph solutions of (6).
(7)
y(t) =
YI(t)] ,
[Y2(t)
SEC. 4.3
141
Constant-Coefficient Systems. phase Plane Method
as two curves over the t-axis, one for each component of ytf). (Figure 79a in Sec. 4.1 shows
an example.) But we can also graph (7) as a single curve in the )'lY2-plane. This is aporamefric
representation (parametric equation) with parameter t. (See Fig. 79b for an example. Many
more follow. Parametric equations al~o occur in calculus.) Such a CUI ve is called a trajectory
(or sometimes an orbit or path) of (6). The YIY2-plane is called the phase plane. 1 If we fill
the phase plane with trajectories of (6), we obtain the so-called phase portrait of (6).
E X AMP L E 1
Trajectories in the Phase Plane (Phase Portrait)
In order to see what is going on, let u, find and graph solutions of the system
I]
-3
(8)
y' = Ay =
Solution. By substituting y =
The characteristic equation is
I
[
xeAt
y.
thus
-3
yi =
y~
=
+
-3)'1
)'1 -
,1'2
3.1'2'
and y' = lLxe At and dropping the exponential function we get Ax = Ax.
det (A - AI) =
1-
3
- A
I
1=
I
,1.2
+ 6,1. +
8 = O.
-3 - A
Thi. gives the eigenvalue, Al = -2 and ,1.2 = -4. Eigenvectors are then obtained from
For Al = -2 this is
-xl
+
t2 = O. Hence we can take x(1) = 1I
and an eigenvector is x(2) = [I
JlT. For ,1.2
=
-4 this becomes Xl +
x2
= O.
_I]T. Tins gives the general solution
.
Figure 81 on p. 142 shows a phase pomait of some of the trajectories (to which more trajectories could be added
if so desired). The two straight trajectories correspond to Cl = 0 and C2 = 0 and the olhers to other choices of
~~
Studies of solutions in the phase plane have recently become quite important, along with
advances in computer graphics, because a phase portrait gives a good general qualitative
impression of the entire family of solutions. This method becomes particularly valuable
in the frequent cases when solving an ODE or a system is inconvenient or impossible.
Critical Points of the System (6)
The point y = 0 in Fig. 81 seems to be a common point of all trajectories, and we want
to explore the reason for this remarkable observation. The answer will follow by calculus.
Indeed, from (6) we obtain
,
(9)
."2
(21)'1
Y1
0U)'l
,
+ 022)'2
+ 012)'2
1A name that come, from physio. where il is the Y-(/IIv)-plane. used to plot a motion in terms of po,ition
= v (m = mass): but the name is now used quite generally for the YlY2-plane.
The use of the phase plane is a qualitatin method. a method of obtaining general qualitative information
on solutions without actually solving an ODE 01' a system. This method was created by HENRI POINCARE
(1854-1912). a great French mathematician, whose work was also fundamental in complex analysis, divergent
series, topology, and astronomy.
y and velocity /
CHAP. 4
142
Systems of ODEs. phase Plane. Qualitative Methods
This associates with every point P: (.vI' )'2) a unique tangent direction d.v2ldYl of the
trajectory passing through P, except for the point P = Po: (0.0), where the right side of
(9) becomes 0/0. This point Po, at which dY2idYl becomes undetermined. is called a critical
point of (6).
Five Types of Critical Points
There are five types of critical points depending on the geometric shape of the trajectories
near them. They are called improper nodes, proper nodes, saddle points, centers, and
spiral points. We define and illustrate them in Examples 1-5.
E X AMP L E 1
(Continued) Improper Node (Fig. 81)
An improper node is a critical point Po at which all the trajectories. except for two of them, have the same
limiting direction of the tangent. The two exceptional trajectories also have a limiting direction of the tangent
at Po which, however, is different.
The system (8) has an improper node at 0, as its phase portrait Fig. 81 shows. The common limiting direction
at 0 is that of the eigenvector xU) = [lIlT because e -4t goes to zero faster than e -2t as r increases. The two
exceptional limiting tangent directions are those of x(2) = [1 _l]T and -x(2) = [-I UT.
•
E X AMP L E 2
Proper Node (Fig. 82)
A proper node is a critical point Po at which every trajectory has a definite limiting direction and for any given
direction d at Po there is a trajectory having d as its limiting direction.
The system
(10)
y' =
,
[~
.\"1 =)'1
,
thus
.\"2 =)'2
has a proper node at the origin (see Fig. 82). Indeed, the matrix is the unit matrix. Its chmacteristic equation
0 is an eigenvector, and we can take [1 ol and [0 l]T. Hence
(I - ),)2 = 0 has the root)' = 1. Any x
a general solution is
'*
y
=
cl
[~J
t
e
+ c2
[~J
)"1 = cle
t
t
or
e
or
)'2
=
Y2
,"
"-
-
---1' '"
'"
/
/'
~
\
/
\
Fig. 81.
•
Y21
,\j~
~\/~
}!:-'
"-
c!Y2 = c2Vl·
C2 et
y
t2l
CtJ
Trajectories of the system (8)
(Improper node)
EO
Yj
;)'
~
~ \'
Fig. 82.
Trajectories of the system (10)
(Proper node)
Yj
SEC. 4.3
Constant-Coefficient Systems. phase plane Method
E X AMP L E 3
143
Saddle Point (Fig. 83)
A saddle point is a critical point Po at which there are two incoming trajeclOries.
all the other trajectories in a neighborhood of Po bypass Po.
The system
1
OJ
Y, = [0
-1
(11)
,
y.
Yl =
thus
TWO
outgoing trajeclOries. and
Yl
has a saddle point at the origin. Its characteristic equation (I - ,1.)( - I - A) = 0 has the roots ,1.1 = I and
,1.2 = -I. For A = I an eigenvector [l OlT is obtained from the second row of (A - AI)x = 0, that is.
OXI + (-I - I)x2 = O. For ,1.2 = -1 the fIrst row gives [0
Hence a general solution is
nT.
or
or
•
This is a family of hyperbolas (and the coordinate axes); see Fig. 83.
E X AMP L E 4
YIY2 = c{)nst.
Center (Fig. 84)
A center is a critical point that is enclosed by infinitely many closed trajectories.
The system
,
y'
(12)
= [
,vI = Y2
0
thus
-4
Y~ = -4Y1
has a center at the origin. The characteristic equation ,1.2 + 4 = 0 gives the eigenvalues 2i and -2i. For 2i an
eigenvector follows from the fIrst equation -2ixl + X2 = 0 of (A - AI)x = 0, say. [l 2ilT. For A = -2i that
equation is -(-2i)xl + X2 = 0 and gives. say. [I -2ilT. Hence a complex general solution is
(12*)
thus
The next step would be the transfOlmation of this solution 10 real form by the Euler formula (Sec. 2.2). But we
were just curious to see what kind of eigenvalues we obtain in the case of a center. Accordingly, we do not
continue. but start again from the beginning and use a shortcut. We rewrite the given equations in the form
yi. = )'2. 4"1 = -y~; then the product of the left sides must equal the product of the right sides.
By integration,
This is a family of ellipses (see Fig. 84) enclosing the center at the origin.
Fig. 83.
Trajectories of the system (11)
(Saddle point)
Fig. 84.
•
Trajectories of the system (12)
(Center)
144
E X AMP L E 5
CHAP. 4
Systems of ODEs. phase Plane. Qualitative Methods
Spiral Point (Fig. 85)
A spiral point is a critical point Po about which the trajectories spiral. approaching Po as t ---'> rx (or tracing
these spirals in the opposite sense. away from Po).
The system
[-I IJ
y' =
(13)
thus
y,
-I
-\
has a spiral point at the origin, as we shall see. The characteristic equation is A2 + 2A + 2 = O. It gives the
eigenvalues -\ + i and -\ - i. Corresponding eigenvectors are obtained from (-\ - A)XI + -"2 = O. For
A = -I + ; this becomes -if1 + X2 = 0 ami we can take [I ;]T as an eigenvector. Similarly. an eigenvector
corresponding to -1 - ; is [I _;]T. This gives the complex general solution
Y=
ci
[IJ
e(-I+i)t
+ c2
i
[IJ
e(-I-i)t .
-;
The next step would be the transformation of this complex solution to a real general solution by the Euler
fOlmula. But. as in the last example. we ju~t wanted to see what eigenvalues to expect in the ca,e of a spiIaI
point. Accordingly. we start again from the beginning and instead of that rather lengthy systematic calculation
we use a shortcut. We multiply the f,rst equation in (13) by YI. the second by Y2. and add. obtaining
We now introduce polar coordinates r. t. where r2 = ."12 + .'"22. Differentiating this with respect to t gives
21T' = 2YIY~ + 2Y2Y~' Hence the previous equation can be written
,
rr =
2
-r .
For each real c
drlr = -dt.
Thus.
thi~
is a spiral.
a~
Fig. 85.
E X AMP L E 6
In
Irl
= -t
+ c"'.
claimed. (see Fig. 85).
r
=
ce -t .
•
Trajectories of the system (13) (Spiral point)
No Basis of Eigenvectors Available. Degenerate Node (Fig. 86)
This cannot happen if A in (1) is symmetric (alQ = ajk. as in Examples 1-3) or skew-symmetnc (akj = -ajk.
thus ajj = 0). And ,t does not happen in many other cases (see Examples 4 and 5). Hence it suffices to explain
the method to be used by an example.
SEC 4.3
145
Constant-Coefficient Systems. phase plane Method
Find and graph a general
~olution
of
y'
(14)
Solution.
= Ay = [
4
-I
IJ
y.
2
A is not skew-symmetric! Its characteristic equation is
det (A - AI)
=
14 -
A
-I
I 1= ,1.2 - 6,1. + 9
2 - A
= (A - 3)2 = O.
It has a double root A = 3. Hence eigenvectors are obtained from (4 - ,1.).\"1 + x2 = O. thus from Xl + -'"2 = O.
say. x(l) = [I - J]T and nonzero multiples of it (which do not help). The method now is to substitute
with constant u = [Ill 112{ into (14). (TIle xT-term alone, the analog of what we did in Sec. 2.2 in the case of
a double rool. would nol be enough. Try iLl This gives
On the right. Ax = Ax. Hence the terms AXTe At cancel, and then division by
x+Au=Au,
Here A = 3 and x = [l
thus
eAt
gives
(A - Anu = x.
-nT. so that
4 - 3
(A - 31)u =
thus
[ -I
A solution. linearly independent of x = [l
_lJT. is u = [0
lJT. This yields the answer (Fig. 86)
The critical point at the origin is often called a degenerate node. c1y(l) gives the heavy straight line, with
> 0 the lower part and c1 < 0 the upper part of it.l2J give~ the right part of the heavy curve from 0 through
•
the second, first. and-finally-fourth quadrants. -l2) gives the other part of that curve.
("1
y
(2)
Y
Fig. 86.
Degenerate node in Example 6
CHAP. 4
146
Systems of ODEs. Phase plane. Qualitative Methods
We mention that for a system (1) with three or more equations and a triple eigenvalue
with only one linearly independent eigenvector, one will get two solutions. as just
discussed, and a third linearly independent one from
with v from
11-91 GENERAL SOLUTION
Find a real general solutiun of the following systems. (Show
the details.)
2. y~
,
Y2
3. Y ~
= Yl
+
4. Y~
)"2
9Yl
,
6. y~
2Yl
.v 2
2Yl
,
Y2
+
1.5)'1
Y2
,
+ 13.5Y2
9Y2
2)"2
+
2."2
!16-171
Y~ = -4Yl - 10Y2
+
CONVERSION
16. The system in Prob. 8.
17. The system in Example 5.
18. (Mixing problem, Fig. 87) Each of the two tanks
contains 400 gal of water. in which initially 100 lb
(Tank T l ) and 40 Ib (Tank T2 ) of fertilizer are
dissolved. The intlow, circulation, and outflow are
shown in Fig. 87. The mixture is kept uniform by
stirring. Find the fertilizer contents Yl(t) in T] and Y2(t)
in T2 .
(P
2Y3
+ Av = Av.
Find a general solution by conversion to a single ODE.
4S . . .a lJ......
}, ----;-
7. y~
u
Nat
..........TJ
y~ = -4.\"1 - 4.\"2 - 4)"3
'8 --"
T2
non
----;- I.
----;-
Fig. 87.
Tanks in Problem 18
---
19. (Network) Show that a model for the currents I] (1) and
12 (t) in Fig. 88 is
y~
0.IY2
=)"1 -
+
1.4Y3
110-151 INITIAL VALUE PROBLEMS
Solve the following initial value problems. (Show the details.)
10. Y ~ = Y1
+
)"2
y~
y~ = 4Yl + Y2
6
=
~Yl + .\'2
."1(0) = 16, )"2(0) = -2
Find a general solution, assuming that R = 20 .0,
L = 0.5 H, C = 2' 10-4 F.
20. CAS PROJECT. Phase Portraits. Graph some of the
figures in this section, in particular Fig. 86 on the
degenerate node. in which the vector y(2J depends on
t. In each figure highlight a trajectory that satisfies an
initial condition of your choice.
c
R
14. )"~
=
-Yl
+
5Y2
y~ = -Yl + 3Y2
15. Y~
=
Y;
=
2Yl
5Yt
+
+
5Y2
L
l2.5Y2
Fig. 88.
LI
-Y
Network in Problem 19
SEC. 4.4
Criteria for Critical Points. Stability
147
4.4 Criteria for Critical Points. Stability
We continue our discussion of homogeneous linear systems with constant coefficients
(1) y'
(lU
= Ay =
in components,
[
(121
From the examples in the last section we have seen that we can obtain an overview of
families of solution curves if we represent them parametrically as yet) = lVI(t) Y2(t)]T
and graph them as curves in the YLv2-plane, called the phase plane. Such a curve is called
a trajectory of (I), and their totality is known as the phase portrait of (I).
Now we have seen that solutions are of the form
Substitution into (1) gives
Dropping the common factor
eAt,
we have
Ax = Ax.
(2)
Hence yet) is a (nonzero) solution of (1) if A is an eigenvalue of A and x a corresponding
eigenvector.
Our examples in the last section show that the general form of the phase portrait is
determined to a large extent by the type of critical point of the system (1) defined as a
point at which dY2/dYI becomes undetermined, DID; here [see (9) in Sec. 4.3]
(l21YI
(3)
(lUYI
+
+
(l22Y2
1112Y2
We also recall from Sec. 4.3 that there are various types of critical points, and we shall
now see how these types are related to the eigenvalues. The latter are solutions A = Al
and A2 of the characteristic equation
(4)
det (A - AI) =
I(lU - A
al2
U21
(122 - A
This is a quadratic equation A2 - pA
given by
+
I = A2 -
(au + a22)A + det A = O.
q = 0 with coefficients p. q and discriminant D.
From calculus we know that the solutions of this equation are
(6)
Furthermore, the product representation of the equation gives
148
CHAP. 4
Systems of ODEs. Phase plane. Qualitative Methods
Hence]J is the sum and q the product of the eigenvalues. Also Al - A2
Together.
= VA from (6).
(7)
This gives the criteria in Table 4.1 for classifying critical points. A derivation will be
indicated later in this section.
Table 4.1
I
Eigenvalue Criteria for Critical Points
(Derivation after Table 4.2)
Name
-
(a) Node
(b) Saddle point
I
(c) Center
(d) Spiral point
p = Al
+
p=O
p=l=O
A2
q
= AIA2
q>O
q<O
q>O
!:J.
= (AI ~~O
~
<0
A2)2
Comments on A1 • A2
Real, same sign
Real, opposite sign
Pure imaginary
Complex. not pure
imaginary
Stability
Critical points may also be classified in terms of their stability. Stability concepts are basic
in engineering and other applications. They are suggested by physics. where stability
means, roughly speaking, that a small change (a small disturbance) of a physical system
at some instant changes the hehavior of the system only slightly at all future times t. For
critical points, the following concepts are appropriate.
DEFINITIONS
Stable, Unstable, Stable and Attractive
A critical point Po of (\) is called stable 2 if, roughly, all trajectories of (\) that at
some instant are close to Po remain close to Po at all future times: precisely: if for
every disk D" of radius E > 0 with center Po there is a disk D t; of radius 8 > 0 with
center Po such that every trajectory of (l) that has a point PI (corresponding to
t = t 1 , say) in Dt; has all its points corresponding to t ~ t1 in DE" See Fig. 89.
Po is called unstable if Po is not stable.
Po is called stable and attractive (or asymptotically stable) if Po is stable and
every trajectory that has a point in DB approaches Po as t ~ x. See Fig. 90.
Classification criteria for critical points in terms of stability are given in Table 4.2. Both
tables are summarized in the stability chart in Fig. 91. In this chart the region of instability
is dark blue.
21n the sense of the Russian mathematician ALEXANDER MICHAILOVICH LJAPUNOV (1857-1918),
whose work was fundamental in stability theory for ODEs. This is perhaps the most appropriate defmition of
stability (and the only we shall lise), but there are others, too.
SEC 4.4
Criteria for Critical Points. Stability
149
Fig. 89. Stable critical point Po of (1) (The trajectory
initiating at P1 stays in the disk of radius E.)
Table 4.2
Fig. 90.
Stable and attractive critical
point Po of (1)
Stability Criteria for Critical Points
Type of Stability
p
(a) Stable and attractive
(b) Stable
(e) Unstahle
Al
=
+
q
A2
p<O
AIA2
--
q>O
q>O
q<O
p~O
p>O
=
OR
-
q
Saddle
point
p
Fig. 91. Stability chart of the system (1) with p, q, ~ defined in (5).
Stable and attractive: The second quadrant without the q-axis.
Stability also on the positive q-axis (which corresponds to centers).
Unstable: Dark blue region
We indicate how the criteria in Tables 4.1 and 4.2 are obtained. If q = Al A2 > 0,
both eigenvalues are positive or both are negative or complex conjugates. If also
p = Al + A2 < O. both are negative or have a negative rea] part. Hence Po is stable
and attractive. The reasoning for the other two lines in Table 4.2 is similar.
If fj, < 0, the eigenvalues are complex conjugates, say, Al = a + i{3 and A2 = a - i{3.
If also p = Al + A2 = 20' < 0, this gives a spiral point that is stable and attractive. If
p = 20' > 0, this gives an unstable spiral point.
If p = 0, then A2 = -AI and q = AIA2 = -AI2. If also q > 0, then A12 = -q < 0,
so that AI, and thus A2, must be pure imaginary. This gives periodic solutions, their
trajectories being closed curves around Po, which is a center.
E X AMP L E 1
Application ofthe Criteria in Tables 4.1 and 4.2
In Example I, Sec. 4.3, we have y' = [-:
lJ y,p = -6, q = 8,
-3
is stable and attractive by Table 4.2(a).
~ = 4, a node by Table 4.1(a), which
•
CHAP. 4
150
E X AMP L E 2
Systems of ODEs. phase plane. Qualitative Methods
Free Motions of a Mass on a Spring
What kind of critical point does my" + 0" + ky
=
Solution. Division by m gives y" = -(kIm))" 4.1). Then)'~ = y" = -(klm)Yl - (elm)Y2' Hence
y' =
0
[
I ] y
-kIm
-elm
det
'
(A -
0 in Sec. 2.4 have?
(elm)y'. To get a system, set
AI) =
I
)'1 =
y, Y2 = Y' (see Sec.
-A
-kIm
We see thatp = -elm. q = kIm, l:!. = (elm)2 - 4klm. From this and Tables 4.1 and 4.2 we obtain the following
results. Note that in the last three cases the discriminant l:!. plays an essential role.
No damping. c = 0, p = 0, q > 0, a center.
UnderdaJllping. c 2 < 4mk, p < O. q > O. ~ < 0, a stable and attractive spiral point.
Critical damping. c 2 = 4mk. p < O. q > O. l:!. = O. a stable and attractive node.
Overdamping. c 2 > 4mk, p < 0, q > 0, l:!. > 0, a stable and attractive node.
--
.•.... -.-....
11-91
...
.. ..
... .-..---.......
--.---~
TYPE AND STABILITY OF CRITICAL POINT
Determine the type and stability of the critical point. Then
find a real general solution and sketch or graph some of the
trajectories in the phase plane. (Show the details of your
work.)
1.
,
,
Yl
2.
2)'2
Y2
3. Yl = 2Yl
+
Y2
.y 1
+
2)'2
,
Y2 =
,
,
5. Yl
Y2
)'1 -
,
,
4Y2
Y2 =
9. Y~ =
,
Y2
6. Yl
,
-5)'1
Yl
+
-
2)'2
IOY2
Y2 = 7)'1
,
3)'1
8Y2
+
Y2 = -5)'1 -
8Y1
+
2)'2
y~ = 2Yl
+
Y2
5Y2
3Y2
FORM OF TRAJECTORIES
What kind of curves are the trajectories of the following
ODEs in the phase plane?
10. y"
+
11. y" -
12. Y"
+
5)"
=
0
k 2y
=
0
14. (Transformation of variable) What happens to the
system (1) and its critical point if you introduce
as a new independent variable?
T
= -t
15. (Types of critical points) Discuss the critical points in
(10)-( 1-1-) in Sec. 4.3 by applying the criteria in Tables
4.1 and 4.2 in tlus section.
16. (Perturbation of center) If a system has a center as
3Y2
4. Y~ = Y2
,
)'1
110-121
=
8. Yl
-2Y2
7. Yl
4Yl
,
+
-4Yl
Y2
,
)'1
,
Y2 = 8y!
,
•
tBY = 0
13. (Damped oscillation) Solve y" + 4y' + 5y = O. What
kind of curves do you get as trajectories?
its critical point, whal happens if you replace the matrix
A by A = A + kI with any real number k =1= 0
(representing measurement errors in the diagonal
entries)?
17. (Perturbation) The system in Example 4 in Sec. 4.3
has a center as its critical point. Replace each Gjk in
Example 4, Sec. 4.3, by ajk + b. Find values of b such
that you get (a) a saddle point. (b) a stable and attractive
node, (c) a stable and attractive spiral. (d) an unstable
spiral, (e) an unstable node.
18. CAS EXPERIMENT. Phase Portraits. Graph phase
portraits for the systems in Prob. 17 with the values of
b suggested in the answer. Try to illustrate how the phase
portrait changes "continuously" under a continuous
change of b.
19. WRITING
EXPERIMENT. Stability. Stability
concepts are basic in physics and engineering. Write a
two-part report of 3 pages each (A) on general
applications in which stability plays a role (be as
precise as you can), and (B) on material related to
stability in this section. Use your own formulations and
examples: do not copy.
20. (Stability chart) Locate the critical points of the
systems (0)-(14) in Sec. 4.3 and of Probs. 1,3,5 in
this problem set on the stability chart.
151
SEC. 4.5
Qualitative Methods for Nonlinear Systems
4.5
Qualitative Methods for Nonlinear Systems
Qualitative methods are methods of obtaining qualitative information on solutions
without actually solving a system. These methods are particularly valuable for systems
whose solution by analytic methods is difficult or impossible. This is the case for many
practically important nonlinear systems
y'
(1)
=
fey),
thus
Y~ = fl(Yl, Y2)
Y~
=
f 2(Yb Y2)·
In this section we extend phase plane methods, as just discussed, from linear systems
to nonlinear systems (1). We assume that (1) is autonomous, that is, the independent
variable t does not occur explicitly. (All examples in the last section are autonomous.)
We shall again exhibit entire families of solutions. This is an advantage over numeric
methods, which give only one (approximate) solution at a time.
Concepts needed from the last section are the phase plane (the YIY2-plane), trajectories
(solution curves of (1) in the phase plane), the phase portrait of (1) (the totality of these
trajectories), and critical points of (1) (points (Yb .\'2) at which both fl()'b )'2) and f2(Yb )'2)
are zero).
Now (1) may have several critical points. Then we discuss one after another. As a
technical convenience, each time we first move the critical point Po: (a, b) to be considered
to the origin (0, 0). This can be done by a translation
which moves Po to (0, 0). Thus we can assume Po to be the origin (0, 0), and for
simplicity we continue to write .\'1' Y2 (instead of 511, )'2). We also assume that Po is
isolated, that is, it is the only critical point of (1) within a (sufficiently small) disk with
center at the origin. If (1) has only finitely many critical points, this is automatically
true. (Explain!)
Linearization of Nonlinear Systems
How can we determine the kind and stability property of a critical point Po: (0, 0) of
(1)? In most cases this can be done by linearization of (1) near Po. writing (1) as
y' = fey) = Ay + hey) and dropping hey), as follows.
Since Po is critical, fl(O. 0) = 0, f2(0, 0) = O. so that fl and f2 have no constant terms
and we can write
(2)
y'
= Ay + hey),
thus
A is constant (independent of t) since (1) is autonomous. One can prove the following
(proof in Ref. [A7], pp. 375-388, listed in App. 1).
CHAP. 4
152
Systems of ODEs. Phase Plane. Qualitative Methods
Linearization
THEOREM 1
If f 1 and f 2 ill (1) are comillllolls alld have contilluollS partial derivatives ill a
neighborhood of the critical point Po: (0, 0). alld !f det A =1= 0 in (2), then the kind
and stability of the critical poillt of (1) {Ire the same as those of the linearized
system
(3)
,
,
Ay,
Y
.r I
=
aIL"l
+ (/12Y2
Y2
=
a21Yl
+ a22Y2'
,
thus
Exceptions occllr!f A has equal or pure i111agilwry eigellvalues; then (1) may have
the same kind of critical point as (3) or a spiral point.
E X AMP L E 1
Free Undamped Pendulum. Linearization
Figure 91a shows a pendulum comisting of a body of mass 111 (the bob) and a rod of length L. Detenmne the
locations and types of the critical points. Assume that the mass of the rod and air resistance are negligible.
Solutioll.
Step 1. Settillg lip the mathematical model. Let () denote the angular displacement, measured
counterclockwise from the equilibrium position. The weight of the bob is mg (g the acceleration of gravity). It
causes a restoring force IIlg sin () tangent to the curve of motion (circular arc) of the bob. By Newton's second
law. at each instant this force is balanced by the force of acceleration mL()", where L()" is the acceleration:
hence the resultant of these two forces is zero. and we obtain as the mathematical model
IIlL()" + IIlg sin () = O.
Dividing this by mL. we have
()" + k sin ()
(4)
= 0
When () is very small. we can approximate sin () rather accurately by () and obtain as an approximate solution
A cos V kt + B sin Vkt. but the exact solution for any () is not an elementary function.
Step 2. Critical po;"ts (0, 0), ±(2rr, 0), ±l4rr, 0), ... , Lilleari;;.atioll. To obtain a system of ODEs. we set
() = .1"1' ()' = )"2' Then from (4) we obtain a nonlinear system (I) of the form
Y~
= hlYl, Y2) = Y2
y~ = .12(.1"1 . .1"2) = -k sinY1·
The right sides arc both zero when .1'2 = 0 and sin.\"1 = O. This gives infinitely many critical points
where Il = O. ± I. ±2, .... We consider (0, 0). Since the Maclaurin series is
(111T.
0).
sin."1 = .\'1 - ~Y13 + - ... = .\'1'
the linearized system at (0. 0) i,
y' = Ay = [
0
thus
-k
To apply our criteria in Sec. -lA we calculate p = all + a22 = 0, q = det A = k = gIL (> 0), and
j. = p2 - 4q = -4k. From this and Table 4.1 (c) in Sec. 4.4 we conclude that (0. 0) is a center. which is always
stable. Since sin (j = sinYI is periodic with period 11T. the critical points (/l1T. 0), /I = ±1. ±4..... are all
centers.
Step 3. Critical poillts ±(rr. 0). ±(3rr. 0), ±(5rr. 0) •.. '. Lilleari:.atioll. We now consider the critical point
(1T. 0). setting () - 1T = Yl and «() - 1T)' = ()' = .\'2' Then in (4).
sin ()
=
sin 1.\'1 + 1Tl = -sin Yl = -Yl + ~YI 3
-
+ ... = -\'1
SEC. 4.5
153
Qualitative Methods for Nonlinear Systems
and the linearized system at (7T, 0) is now
thus
We see that p = 0, q = - k « 0), and D. = -4q = 4k. Hence, by Table 4.1(b), this gives a saddle point, which
is always unstable. Because of periodicity, the critical points (/17T, 0), /1 = ::'::1, ::'::3, .. " are all saddle points.
•
These results agree with the impression we get from Fig. Y2b.
mg
(a) Pendulum
(b) Solution curvesY2(Yj) of (4) in the phase plane
Fig. 92.
EXAMPLE 2
Example 1 (C will be explained in Example 4.)
Linearization of the Damped Pendulum Equation
To gain further experience in investigating critical points, as another practically imponant case. let us see how
Example I changes when we add a damping term ce' (damping proportional to the angular velocity) to equation
(4), so that it becomes
e" + ce'
(5)
+ k sin
e=
0
where k > 0 and c ::::; 0 (which includes Ollr previous casc of no damping, c = o1. Setting
before, we obtain the nonlinear system (use e" = )'~)
e=
-"1,
e' = Y2'
as
,
Y1
=)'2
y~ = -k sinYI - cY2.
We see that the critical poinrs have the same locations as before. namely. (0, 0). (::'::7T. 0), (::'::27T. 0), .... We
consider (0, 0). Linearizing sin Yl = )'1 as in Exan1ple 1, we get the linearized system at (0, 0)
,
(6)
y' = Ay = [0
-k
)'1 =
lJ Y
-c
Y2
thus
'
This is identical with the system in Example 2 of Sec 4.4, except for the (positive!) facror I1l (and except for
the physical meaning of Yl)' Hence for c = 0 (no damping) we have a center (see Fig. 92b). for small damping
we have a spiral point (see Fig. 93), and so on.
We now consider the critical point (7T, 0). We set e - 7T = Yl, (e - 7T/ = e' = Y2 and linearize
sin
e ~ sin(YI
+
7T) = - sinYl = -y!.
This gives the new linemized system at (7T, 0)
,
(6*)
y'=AY=[O
k
IJy,
-c
thus
Yj = Y2
154
CHAP. 4
Systems of ODEs. Phase plane. Qualitative Methods
For our criteria in Sec 4.4 we calculate p = au + a22 = -c. q = det A
This gives the following results for the critical point at (1T, 0).
= -k, and D.
=
p2 - 4q = c 2 + 4k.
No damping. c = 0, p = O. q < 0, ~ > O. a saddle point. See Fig. 92b.
Damping. c > 0, p < O. q < 0, D. > O. a saddle point. See Fig. 93.
Since sin ,1'1 is periodic with period 21T, the critical points (:+:21T, 0), (:+:41T, 0), ... are of the same type a~
(0.0). and the critical points (-1T, 0), (:+:31T. 0), ... are of the same type as (1T. 0), so that our task is finished.
Figure 93 shows the trajectories in the case of damping. What we see agrees with our physical intuition. Indeed.
damping means loss of energy. Hence instead of the closed trajectories of periodic solutions in Fig. 92b we now
have trajectories spiraling around one of the critical points (0, 0), (:!:21T, 0), .... Even the wavy trajectorie~
corresponding to whirly motions eventually spiral around one of these points. Furthermore. there are no more
trajectories that connect critical points (as there were in the undamped case for the saddle points).
•
Fig. 93.
Trajectories in the phase plane for the damped pendulum
in Example 2
Lotka-Volterra Population Model
E X AMP L E 3
Predator-Prey Population Model3
This model concerns two species, say, rabbits and foxes, and the foxes prey on the rabbits.
Step 1. Setting lip the model. We assume the following.
1. Rabbits have unlimited food supply. Hence if there were no foxes. their number Yl(t) would grow
exponentially. \'~ = ay!.
2. Actually, Yl is decrea,ed because of the kill by foxes. say. at a rate proportional to YIY2' where
the number of foxes. Hence y~ = aYl - bYIY2, where a > 0 and b > O.
)'2(t)
is
3. If there were no rabbits, then Y2(f) would exponentially decreaSe to zero, y~ = -1.\'2' However, Y2 is
increased by a rate proportional to the number of encounters between predator and prey; together we
have Y~ = - IY2 + kyl)'2' where k > 0 and 1 > O.
This gives the (nonlinear!) Lotka-Volterra system
(7)
Y; =
h(Yl,
)'2) =
aYl - bylY2
,1'2 = f 2(Yl,.\'2) = /"'}'1.'·2
-
lY2 .
3lntroduced by ALFRED 1. LOTKA (1880-1949), Amelican biophySicist. and VITO VOLTERRA
(1860-1940), Italian mathematician, the initiator of functional analysis (see [GR7] in App. 1).
SEC. 4.5
Qualitative Methods for Nonlinear Systems
155
Step 2. Critical point (0, 0), Linearization. We see from (7) that the critical pomts are the solutIOns of
(7*)
I a
The solutions are (Yl, Y2) = (0,0) and (k' b)' We consider (0, 0). Dropping -bYIY2 and kYIY2 from
0) gives
the linearized system
OJ y.
-I
Its eigenvalues are Al
=
a > 0 and ,1,2
=
-I
< O. They have opposite signs, so that we get a saddle point.
Step 3. Critical point (lIk, alb), Linearization. We set Yl = Yl + Ilk, Y2 = 3'2 + alb. Then the critical point
(Ilk, alb) corresponds to (Yl, Y2) = (0,0). Since Y; = y;, y~ = y~, we obtain from 0) [factorized as in (8)]
yi =
(Yl
+
±)
y~
(Y2
+
i) [k(S\ + ±) - lJ
=
[a - b(Y2
+
~) J =
(Yl
±)
+
(3'2 +
(-byv
i )k)\.
Dropping the two nonlinear terms -bYIY2 and "5'IY2, we have the linearized system
~,
(a)
Yl = -
(b)
Y2=
Ib
T
~
Y2
(7**)
~,
ak
b ''t.
The left side of (a) times the right side of (b) must equal the right side of (a) times the left side of (b),
ak ~
By integration.
bYI
2
lb ~
+ T)'2
2
=
const.
This is a family ellipses, so that the critical point (Ilk, alb) of the linearized system 0**) is a center (Fig. 94).
It can be shown by a complicated analysis that the nonlinear system (7) also has a center (rather than a spiral
point) at (Ilk, alb) surrounded by closed trajectories (not ellipses).
We see that the predators and prey have a cyclic variation about the critical point. Let us move counterclockwise
around the ellipse, beginning at the right vertex, where the rabbits have a m1L'dmum number. Foxes are sharply
increasing in number until they reach a maximum at the upper vertex, and the number of rabbits is then sharply
decreasing until it reaches a minimum at the left vertex, and so on. Cyclic variations of this kind have been
observed in nature, for example, for lynx and snowshoe hare near the Hudson Bay, with a cycle of about 10
years.
For models of more complicated situations and a systematic discussion, see C. W. Clark, Mathematical
Bioeconolllics (Wiley, 1976),
•
Y2
~
b
~
------~
I
k
Ecological equilibrium pOint and trajectory
of the linearized Latka-Volterra system (7**)
Fig. 94.
156
CHAP. 4
Systems of ODEs. phase plane. Qualitative Methods
Transformation to a First-Order Equation
in the Phase Plane
Another phase plane method is based on the idea of transforming a second-order
autonomous ODE (an ODE in which t does not occur explicitly)
F()'.
to first order by taking),
y" by the chain rule,
y'. -,"") = 0
= )"1 as the independent variable, setting y' = )'2 and transforming
)'
dY2 dy!
" =)'2, =
dy!
dt
Then the ODE becomes of first order,
(8)
and can sometimes be solved or treated by direction fields. We illustrate this for the
equation in Example I and shall gain much more insight into the behavior of solutions.
E X AMP L E 4
An ODE (8) for the Free Undamped Pendulum
If in (4) 6"
+k
sin 6 = 0 we set 6 = .\"1. 6' = .1"2 (the angular velocity) and use
6"
dY2
dt
=-
dY2
((VI
dYl
dt
we get
Separation of variables gives.l'2 dY2 = -k sin Yl elYl' By integration.
(9)
ll'22 = k cosYl +
e
(e constant).
Multiplying this by mL2. we get
We see that these three terms are energies. Indeed. Y2 is the angular velocity. so that LY2 is the velocity and the
tirst term b the kinetic energy. The ~ecoml term (including the minus sign) is the potential energy of the pendulum.
2
and mL e is its total energy, which is constant, as expected from the law of conservation of energy, because
there is no damping (no loss of energy). The type of motion depends on the total energy. hence on C. as follows.
Figure 92b on p. 153 shows trajeclOries for various values of C. These graphs continue periodically with
period 27TtO the left and to the right. We see that some of them are ellipse-like and closed. others are wavy,
and there are two trajectories (passing through the saddle points (1/7T. 0). n = ::':: I. ::'::3.... I that ~eparate
those two types of trajectories. From (9) we see that the smallest possible e is e = -k; then.l"2 = 0, and
cos VI = I. so that the pendulum is at rest. The pendulum will change its direction of motion if there are points
at which Y2 = e' = O. Then k cos Yl + e = 0 by (9). If,vl = 7T, then cos .1'1 = - I and e = k. Hence if
-J.. < e < k, then the pendulum reverses its direction for a IYll = lei < 7T. and for these values of e with
lei < Ii: the pendulum oSl:iIlates. This corresponds to the closed trajectories in the figure. However. if e > k,
then Y2 = 0 is impossible and the pendulum makes a whirly motion that appears as a wavy trajectory in the
YIY2-plane. Finally. the value e = k correspond, to the two "separating trajectories" in Fig. 92b connecting the
saddle points.
•
The phase plane method of deriving a single first-order equation (8) may be of practical interest
not only when (8) can be solved (as in Example 4) but also when solution is not possible and
we have to utilize direction fields (Sec. 1.2). We illustrate this with a very famous example:
SEC. 4.5
Qualitative Methods for Nonlinear Systems
E X AMP L E 5
157
Self-Sustained Oscillations. Van der Pol Equation
There are physical systems such that for small oscillations, energy is fed into the system, whereas for large
oscillations. energy is taken from the ~y~tem. In o!her words, large oscillations will be damped, whereas for
small oscillations there is "negative damping" (feeding of energy into the system). For physical reason~ we
expect such a system to approach a periodic behavior, which will thus appear as a closed trajectory in the phase
plane. called a limit cycle. A differential equation describing such vibrations is the famous van der Pol
4
equation
(M > 0, constant).
(10)
It first occurred in the study of electrical circuits containing vacuum tubes. For M = 0 this equation becomes
Y" + ." = 0 and we obtain harmonic oscillations. Let M > O. The damping term has !he factor -M(I _ y2).
This is negative for small oscillation~, when y2 < I. so that we have "negative damping," is .lero for y2 = I (no
damping), and is positive if)'2 > 1 (positive damping, loss of energy). If M is small, we expect a limit cycle
that is almost a circle because then our equation differs bm little from y" + )' = O. If M is large. the limit
cycle will probably look different.
Setting y = ."1, y' = )'2 and using y" = (dY2/dYl)Y2 as in (8), we have from (10)
(II)
The isoclines in the YIY2-plane (the phase plane) are the curves dY2/dYl
~
K
=
consf,
thaI is,
Solving algebraically for Y2, we see that the isoclines are given by
Yl
(Figs. 95, 96).
Y2
5 K=O
K=-l
K= l
4
K= 1
K=-5
/
K=l
K=-5
K=-l
-5
Fig. 95. Direction field for the van der Pol equation with fL = 0.1 in the phase plane,
showing also the limit cycle and two trajectories. See also Fig. 8 in Sec. 1.2.
4BALTHASAR VAN DER POL (I 88<j-I<jS9), Dutch physicist and engineer.
CHAP. 4
158
Systems of ODEs. Phase plane. Qualitative Methods
Figure 9S shows some isoclines when fL is small, f.L = 0.1. the limit cycle (almost a circle), and two (blue)
trajectories approaching it. one from the outside and the other from the inside. of which only the initial portion,
a small spiral, is shown. Due to this approach by trajectories. a limit cycle differs conceprually from a closed
curve (a trajectory) surrounding a center, which is not approached by trajectories. For larger fL the limit cycle
no longer resembles a circle, and the trajectories approach it more rapidly than for smaller fL. Figure 96 illustrates
this for fL = I.
•
K=D
K=-l
K=-l
K= 1
\
K=O
,,/'"
K=-5
-3
K=O
K=-5
K=O
K= 1
K=-l
Fig. 96.
.
CRITICAL POINTS, LINEARIZATION
9
Detennine the location and type of all critical points by
linearization. In Probs. 7-12 first transform the ODE to a
system. (Show the details of your work.)
1. y~ = Y2
3. y~
2.
)"22
5. )' ~
-YI
,
+
Y
. 2
\'
.2
-
+
.\' 1
2
+ Y2
- Y2
y~ = YI - 3Y2
2
Y2
)'2 -
2
6. y~ = Y2 - Y2 2
y~ = Yl - Y1 2
-Yl - Y2
Y2
4\'
. 1
4. y~ = -3Yl
2Yl - h
.\'2
,
•\' 1
,. ,
4Y2
,
7. y"
+
o
K=-l
Direction field for the van der Pol equation with IL = 1 in the phase plane.
showing also the limit cycle and two trajectories approaching it
8. y" + 9)' + y2
o
2
\'
" + cos y
Ll. Y"
+ 4)' -
0
10. y"
y3 = 0
12. Y"
=
+ sin y =
+ Y' + 2}' -
0
y2 = 0
13. (Trajectories) What kind of curves are the trajectories
of -,~y" + 2/ 2 = O?
14. (Trajectories) Write the ODE y" - 4y + )'3 = 0 as a
system. ~llive it for Y2 as a function of ."1. and sketch
or graph some of the trajectories in the phase plane.
15. (Trajectories) What is the radius of a real general
solution of y" + Y = 0 in the phase plane?
16. (Trajectories) In Prob. 14 add a linear damping tenn
to get y" + 2y' - 4y + y3 = O. Using arguments from
mechanics and a comparison with Prob. 14, as well as
with Examples I and 2. guess the type of each critical
point. Then determine these types by linearization.
(Show all details of your work.)
SEC. 4.6
17. (Pendnlum) To what state (position, speed, direction
of motion) do the four points of intersection of a
closed trajectory with the axes in Fig. 92b correspond?
The point of intersection of a wavy curve with the
Y2- axis ?
18. (Limit cycle) What is the essential difference between
a limit cycle and a closed trajectory surrounding a
center?
19. CAS EXPERIMENT. Deformation of Limit Cycle.
Convert the van der Pol equation to a system. Graph
the limit cycle and some approaching trajectories for
fL = 0.2,0.4,0.6, 0.8, 1.0, l.5, 2.0. Try to observe how
the limit cycle changes its form continuously if you
vary IL continuously. Describe in words how the limit
cycle is deformed with growing fL.
20. TEAM PROJECT. Self-sustained oscillations.
(a) Van der Pol Equation. Determine the type of the
critical point at (0, 0) when IL > 0, IL = 0, IL < O.
4.6
159
Nonhomogeneous Linear Systems of ODEs
Show that if IL -) 0, the isoclines approach straight
lines through the origin. Why is this to be expected?
(b) Rayleigh equation. Show that the so-called
Rayleigh equation5
y" - IL(I - §y'2)y' + Y
= 0
(IL> 0)
also describes self-sustained oscillations and that by
differentiating it and setting y = y' one obtains the van
der Pol equation.
(c) Duffing equation. The Duffing equation is
y"
+
wo2 y
+ f3y 3
=
0
where usually 1f31 i~ small. thus characterizing a small
deviation of the restoring force from linearity. f3 > 0
and f3 < 0 are called the cases of a hard spring and a
soft spring, respectively. Find the equation of the
trajectories in the phase plane. (Note that for f3 > 0 all
these curves are closed.)
Nonhomogeneous Linear Systems of ODEs
In this last section of Chap. 4 we discuss methods for solving nonhomogeneous linear
systems of ODEs
(1)
y' = Ay
+
g
(see Sec. 4.2)
where the vector g(t) is not identically zero. We assume g(t) and the entries of the 11 X II
matrix A(t) to be continuous on some interval 1 of the t-axis. From a general solution
y(h)(t) of the homogeneous system y' = Ay on J and a particular solution y(P)(t) of
(1) on J [i.e., a solution of (1) containing no arbitrary constants], we get a solution
of (l),
(2)
y is called a general solution of (I) on 1 because it includes every solution of (l) on 1.
This follows from Theorem 2 in Sec. 4.2 (see Prob. 1 of this section).
Having studied homogeneous linear systems in Secs. 4.1-4.4. our present task will be
to explain methods for obtaining particular solutions of (I). We discuss the method of
undetermined coefficients and the method of the variation of parameters; these have
counterparts for a single ODE, as we know from Secs. 2.7 and 2.10.
5 LORD RAYLEIGH (JOHN WILLIAM STRUTI) (1842-1919). great English physicist and mathematician.
professor at Cambridge and London. known by his important contributions to the theory ot waves, elasticity
theory. hydrodynamics. and various other branches of applied mathematics and theoretical physics. In 1904 he
received the Nobel Prize in physics.
160
CHAP. 4
Systems of ODEs. Phase plane. Qualitative Methods
j
Method of Undetermined Coefficients
As for a single ODE, this method is suitable if the entries of A are constants and the
components of g are constants, positive integer powers of t, exponential functions, or
cosines and sines. In such a case a particular solution yep) is assumed in a fonn similar
to g; for instance, y(P) = U + vt + wt2 if g has components quadratic in t, with u, v, w
to be determined by substitution into (I). This is similar to Sec. 2.7, except for the
Modification Rule. It suffices to show this by an example.
E X AMP L E 1
Method of Undetermined Coefficients. Modification Rule
Find a general solution of
Solution.
A general equation of the homogeneous system is (see Example I in Sec. 4.3)
Since A = -2 is an eigenvalue of A, the function e- 2t on the right also appears in yChl, and we must apply the
Modification Rule by setting
(rather than ue -2t).
Note that the first of these two terms is the analog of the modification in Sec. 2.7. but it would not be sufficient
here. (Try It.) By substitution,
Equating the Ie -2t-terms on both sides, we have - 2u = Au. Hence u is an eigenvector of A corresponding to
A = -2; thus [see (5)] u = all lIT with any a 'F O. Equating the other terms gives
thus
Collecting terms and reshuffling gives
-v 1 +
V2 =
By addition, 0 = -2a - 4, a = -2, and then v2 = VI
We can simply choose k = O. This gives the answer
-a + 2.
+ 4, say, VI
For other k we get other v; for instance, k = -2 gives v = [-2
= k, v2
= k + 4, thus, v = [k
+ 4]T.
2]T, so that the answer becomes
(5*)
etc. •
Method of Variation of Parameters
This method can be applied to nonhomogeneous linear systems
(6)
k
y'
= A(t)y + get)
SEC. 4.6
161
Nonhomogeneous Linear Systems of ODEs
with variable A = ACt) and general get). It yields a particular solution y(p) of (6) on some
open interval J on the t-axis if a general solution of the homogeneous system y' = A(t)y
on J is known. We explain the method in terms of the previous example.
EXAMPLE 2
Solution
by the Method of Variation of Parameters
Solve (3) in Example I.
Solutioll. A basis of solutions of the homogeneous, ystem is [e -2t
the general solution (4) of the homogenous system may be written
-2t
y(hl =
(7)
[
e
e- 2t
-e -4t]T. Hence
e
'"I
-4t] [ ]
_e-4t
C2 = YU)e.
r
Here, Y(n = [y(!) y(2J is the fundamental matrix (see Sec. 4.2). As in Sec 2.10 we replace the constant
vector e by a variable vector u(t) to obtain a particular solution
yep) = Y(t)u(t).
Substitution into (3) y' = Ay + g gives
Y'u+Yu'=AYu+g.
(8)
Now since y(lJ and
y(2J
are solutions of (he homogeneous system. we have
Y' = AY.
thus
Hence Y' u = AYu, so that (8) reduces to
Yu' = g.
The solution is
here we use that the inverse y- 1 of Y (Sec. 4.0) exists because the detenninant of Y is the Wronskian W, which
is not zero for a basis. Equation (9) in Sec. 4.0 gives the form of y-l.
y-l =
-e -4t
- 2e -6t [ -e -2t
We multiply this by g, obtaining
Integration is done componentwi,e (just
a~
differentiation) and gives
L
t [
u(t) =
-2 - ] dt = [ - 2 t
]
-4e2t
_2e2t + 2
(where + 2 comes from the lower limit of integration). From this and Y in (7) we obtain
e- 2t
Yu =
[
e -2t
4t
e- ] [
-e- 4t
-2t
-2e2t + 2
]
[-2Ie=
2t
-
2e-
2t
+ 2e-
4t
-2te- 2t + 2e- 2t _ 2e- 4t
[-2t -
]
=
2J
-21 + 2
2t
e-
+
The last term on the right is a solution of the homogeneous system. Hence we can absorb it into lh). We thus
obtain as a general solution of the system (3). in agreement with (5*).
(9)
•
162
CHAP. 4
Systems of ODEs. Phase plane. Qualitative Methods
-
----_ Z.Q.i= ~CI==~
1. (General solution) Prove that (2) includes every
solution of (I).
12-91
14.
)' ~ = 3YI -
4Y2
+ 20
)'~
co~ t
GENERAL SOLUTION
Find a general solution. (Show the details of your work.)
2'Y~=Y2+t
y~
=
4. Y; =
+
+ 5 cos t 5. J~
)'2
"2 - 5 sin t
)"~ = 3Y1 -
7. Y~ = -14)"1
y~ = -5Yl
8.
y;
=
+
+
=
+
=
=
y~ =
-4Y1
2)'1
+
5)'1 -
15.
+ 5
2Y2
+ 12
)"2 -
30
+
10(1 - t - t 2 )
IOY2
+
4 - 20t - 6t 2
4)'2
+
+
lit
+ 3e- t
6.\'2
16.
4Yl
+
8Y2
+ 2 cos t
y~ = 6YI
+
2Y2
+
=
-
cust -
16 sin t
14 sint
+ 162
IOY2
Y2 - 3241
-3Ji -
)'~ = 5Y1
+ 9t
= 4Y2
6)'2
IOY1 -
y~ = 6YI -
9 . .\'~
Y~
3t
Y1 )"1
3. Y~
-
17. (Network) Find the currents in Fig. 97 when R = 2.5 D.
L = 1 H, C = 0.04 F, E(t) = 845 sin t Y, and 11(0} = 0,
[2(0) = O. (Show the details.)
18. (Network) Find the currents in Fig. 97 when R = I D.
L = 10 H, C = 1.25 F, E(t) = 10 kY, and 11 (0) = 0,
[2(0) = O. (Show the details.)
15
15T - 20
c
E
10. CAS EXPERIMENT. Undetermined Coefficients.
Find out experimentally how general you mLLst choose
y(jJ). in particular when the components of g have a
different form (e.g., as in Prob. 9). Write a short report,
covering also the situation in the case of the
modification rule.
=
1-161 INITIAL VALUE PROBLEM
Solve {showing details):
II.
y~ = -2Y2
y~
=
+
." ~
=
Network in Probs. 17, 18
19. (Network) Find the CUiTents in Fig. 98 when R1 = 2 D,
R2 = 8 n. L = 1 H. C = 0.5 F. E = 200 Y. (Show the
details.)
L
4t
2YI - 2t
!
)'1(0) = 4, )'2 (0) =
12.
Fig. 97.
4Y2
+
5e
t
Switch
20e- t
y~ = -Yl -
Fig. 98.
13.
Y;
=
YI
y~ = )'1(0)
= 1,
+
)'2
2.\"2
+
)"2(0)
1
+
e 2t
+ t
= -4
-
2t
c
Network in Prob. 19
20. WRITING PROJECT. Undetermined Coefficients.
Write a short report in which YOLL compare the
appl ication of the method of undetennined coeflicients
to a single ODE and to a system of two ODEs, using
ODEs and systems of your choice.
163
Chapter 4 Review Questions and Problems
,
.•
TIONS AND PROBLEMS
1. State some applications that can be modeled by systems
of ODEs.
23. Y ~
=
24. y~
= Y1 -
4.h + 3Y2 + 2
2. What is population dynamics? Give examples.
3. How can you transform an ODE into a system of ODEs?
4. What are qualitative methods for systems? Why are they
important?
y~
=
2Y2 - sin t
3Y1 - 4Y2 - cos t
5. What is the phase plane? The phase plane method? The
phase portrait of a system of ODEs?
6. What is a critical point of a system of ODEs? How did
we classify these points?
7. What are eigenvalues? What role did they play in this
chapter?
8. What does stability mean in general? In connection with
critical points?
9. What does linearization of a system mean? Give an
example.
10. What is a limit cycle? When may it occur in mechanics?
26. (Mixing problem) Tank Tl in Fig. 99 contains initially
200 gal of water in which 160 lb of salt are dissolved.
Tank T2 contains initially 100 gal of pure water. Liquid
is pumped through the system as indicated. and the
mixtures are kept uniform by stirring. Find the amounts
of salt Y1(t) and Y2(t) in Tl and T2 , respectively.
Water,
--
10<
Il~-!2J
GENERAL SOLUTION. CRITICAL POINTS
Find a general solution. Determine the kind and stability of
the critical point. (Show the details of your work.)
11. Y~
12. Y~ = 9Y1
= 4.\'2
I
Y2
13. Y~
I
14. Y1
= Y2
I
Y2
15. y~
=
16 gal/min
1.5.h - 6Y2
I
16. Y1
I
.\'2
I
18. Y1
=
Fig. 99.
Y2
3Y2
3)'1
3)'1
+
3)'2
-3Y1
2Y2
-2Y1
3Y2
3Y1
+
I
Y2 = -5Yl
3Y2
NONHOMOGENEOUS SYSTEMS
Find a general solution. (Show the details.)
y~ = 12Y1
22. y~
= )'1
+
+ 6t
+
1
Y2
+ sin t
Tanks in Problem 26
5Y2
-
[u.!.-~
= 3)'2
---
27. (Critical point) What kind of critical point does y' = Ay
have if A has the eigenvalues -6 and I?
28. (Network) Find the currents in Fig. 100. where
R1 = 0.5 fl, R2 = 0.7 fl, Ll = 0.4 H, L2 = 0.5 H,
E = 1 kV = 1000 V, and ll(O) = 0,/2(0) = O.
Fig. 100.
20. y~
Mixture,
o gal/min
Network in Problem 28
29. (Network) Find the currents in Fig. 10 1 when R = 10 fl,
L = 1.25 H. C = 0.002 F. and 11 (0) = liG) = 3 A.
21. )'~ = )'1 + 2.\'2 + e 2t
Fig. 101.
Network in Problem 29
CHAP. 4
164
Systems of ODEs. Phase plane. Qualitative Methods
130-331 LINEARIZATION
Detelmine the location and kind of all critical points of the
given nonlinear system b) linearization.
30. y~
31. )'~
= )'2
,
Y2
==-~::.".::'.I' -;:~==]= .. ::
=
-9Y2
=
smYI
32. )' ~
33. y~
= COS)'2
=
Y2 - 2Y2
Y; = Yl -
2Y1
2
2
.
:.:==:
Systems of ODEs. Phase Plane. Qualitative Methods
Whereas single electric circuits or single mass-spring systems are modeled by single
ODEs (Chap. 2). networks of several circuits. systems of several masses and springs.
and other engineering problems lead to systems of ODEs, involving several unknown
functions ."1(1), ... , YI1(1)· Of central interest are first-order systems (Sec. 4.2):
y' = f(t, y),
in components,
to which higher order ODEs and systems of ODEs can be reduced (Sec. 4.1). In
this summary we let 11 = 2. so that
y'
(1)
= f(t, y),
Y; = fl(t, Yh Y2)
in components.
.\'~
=
f2(1, .\'1, .\'2)
Then we can represent solution curves as trajectories in the phase plane (the
YIY2-plane), investigate their totality [the "phase portrait" of (1 )J, and study the
kind and stability of the critical points (points at which both f 1 and f 2 are zero),
and classify them as nodes, saddle points, centers, or spiral points (Secs. 4.3, 4.4).
These phase plane methods are qualitative; with their use we can discover various
general properties of solutions without actually solving the system. They are
primarily used for autonomous systems, that is, systems in which t does not occur
explicitly.
A linear system is of the fonn
(2) y'
=
Ay
+ g, where A =
[(/11
°21
If g
(3)
L
:~:J'
y
=
[:J, [:J .
= 0, the system is called homogeneous and is of the form
y' = Ay.
g =
165
Summary of Chapter 4
If all, . . . , a22 are constants, it has solutions Y =
quadratic equation
and x -:f- 0 has components
Xl' X2
xeAt,
where A is a solution of the
determined up to a multiplicative constant by
(These A's are called the eigenvalues and these vectors x eigenvectors of the matrix
A. Further explanation is given in Sec. 4.0.)
A system (2) with g -:f- 0 is called nonhomogeneous. Its general solution is of
the form Y = Yh + Yp, where Yh is a general solution of (3) and Yp a particular
solution of (2). Methods of determining the latter are discussed in Sec. 4.6.
The discussion of critical points of linear systems based on eigenvalues is
summarized in Tables 4.1 and 4.2 in Sec. 4.4. It also applies to nonlinear systems
if the latter are first linearized. The key theorem for this is Theorem L in Sec. 4.5,
which also includes three famous applications, namely the pendulum and van der
Pol equations and the Lotka-Volterra predator-prey population model.
• •••
n
.~
~
~,
CHAPTER
~I\
,,1
5
...........
---r • I
'ill;.
1
1 1/'1\
' I',
...
Series Solutions of ODEs.
Special Functions
In Chaps. 2 and 3 we have seen that linear ODEs with constant coefficients can be solved
by functions known from calculus. However. if a linear ODE has variable coefficients
(functions of x). it must usually be solved by other methods. as we shall see in this
chapter.
Legendre polynomials, Bessel functions, and eigenfunction expansions are the three
main topics in this chapter. These are of greatest importance to the applied mathematician.
Legendre's ODE and Legendre polynomials (Sec. 5.3) are likely to occur in problems
showing spherical symmetry. They are obtained by the power series method (Secs. 5.1,
5.2). which gives solutions of ODEs in power series.
Bessel's ODE and Bessel functions (Secs. 5.5, 5.6) are likely to occur in problems
showing cylindrical symmetry. They are obtained by the Frobenius method (Sec. 5.4),
an extension of the power series method which gives solutions of ODEs in power series,
possibly multiplied by a logarithmic tenn or by a fractional power.
Eigenfunction expansions (Sec. 5.8) are infinite series obtained by the SturmLiouville theory (Sec. 5.7). The terms of these series may be Legendre polynomials or
other functions, and their coefficients are obtained by the orthogonality of those functions.
These expansions include Fourier series in terms of cosine and sine, which are so
in1portant that we shall devote a whole chapter (Chap. II) to them.
Special functions (also called higher functions) is a name for more advanced functions
not considered in calculus. If a function occurs in many applications, it gets a name, and
its properties and values are investigated in all details, resulting in hundreds of formulas
which together with the underlying theory often fill whole books. This is what has
happened to the gamma, Legendre, Bessel, and several other functions (take a look into
Refs. [GRI], [GRIO], [All] in App. 1).
Your CAS knows most of the special functions and corresponding formulas that you
will ever need in your later work in industry, and this chapter will give you a feel for the
basics of their theory and their application in modeling.
COMMENT You can study this chapter directly after Chap. 2 because it needs no
material from Chaps. 3 or 4.
Prerequisite: Chap. 2.
Sections that may be omitted il1 a shorter course: 5.2, 5.6-5.8.
References and Answers to Problems: App. I Part A, and App. 2.
166
167
SEC. 5.1
Power Series Method
5.1
Power Series Method
The power series method is the standard method for solving linear ODEs with variable
coefficients. It gives solutions in the form of power series. These series can be used for
computing values, graphing curves, proving formulas, and exploring properties of solutions,
as we shall see. In this section we begin by explaining the idea of the power series method.
Power Series
From calculus we recall that a power series (in powers of x - xo) is an infinite series of
the form
(1)
2:
a",(x - xo)m
= ao +
al (x - xo)
+
+
a2(x - XO)2
m=O
Here, x is a variable. ao, at. a 2, ... are constants, called the coefficients of the series.
is a constant, called the center of the series. In particular, if Xo = 0, we obtain a power
series in powers of x
Xo
2:
(2)
amx ln
=
ao
+
alx
a3 x3 +
+ a2x2 +
m=O
We shall assume that all variables and constants are real.
Familiar examples of power series are the Maclaurin series
I
00
--- =
1- x
eX
2:
(Ixl <
= 1 + x + x 2 + ...
xm
1, geometric series)
m=O
=
2:
m=O
cosx =
2:
'1'11=0
sin x =
2:
m=O
x2
xm
111!
= I +x+
2!
(_l)mx2m
(2111) !
(_I)mx 2m+l
(2m
+ I)!
I
+
-
x2
2!
=x-
x3
3!
+ ...
X4
+
x3
+ ...
-
4!
x5
+ 5!
3!
-
+ ....
We note that the term "power series" usually refers to a series of the form (1) lor (2)]
but does not include series of negative or fractional powers of x. We use 111 as the
summation letter, reserving n as a standard notation in the Legendre and Bessel equations
for integer values of the parameter.
Idea of the Power Series Method
The idea of the power series method for solving ODEs is simple and natural. We describe
the practical procedure and illustrate it for two ODEs whose solution we know, so that
168
CHAP. 5
Series Solutions of ODEs. Special Functions
we can see what is going on. The mathematical justification of the method follows in the
next section.
For a given ODE
y"
+ p(x)y' +
q(x)y
=0
we first represent p(x) and q(x) by power series in powers of x (or of x - Xo if solutions
in powers of x - xo are wanted). Often p(x) and q(x) are polynomials, and then nothing
needs to be done in this first step. Next we assume a solution in dle form of a power series
with unknown coefficients,
(3)
y
L
=
arnxTn
=
ao
+
(/tX
+
a2x2
+
a3x3
+
m=O
and insert this series and the series obtained by term wise differentiation,
(a)
)"
=L
mamxrn - 1
= a] + 2a2x +
3a3x2
+ ...
m~]
(4)
(b)
)'''
=
L
m(m -
l)a m x m - 2
=
2(/2
+ 3· 2a3x + 4· 3a4x2 + ..
m=2
into the ODE. Then we collect like powers of x and equate the sum of the coefficients of
each occuning power of x to zero, starting with the constant terms, then taking the terms
containing x, then the terms in x 2 , and so on. This gives equations from which we can
determine the unknown coefficients of (3) successively.
Let us show this for two simple ODEs that can also be solved by elementary methods,
so that we would not need power series.
E X AMP L E 1
Solve the following ODE by power series. To grasp the idea. do this by hand: do not use your CAS (for
which you could program the Whole process).
y' = 2xy.
Soluti01l. We insert (3) and (4a) into the given ODE. obtaining
We must perform the multiplication by
a]
+
211 2 X
2110X
2~
+
+
on the right and can write the resulting equation conveniently as
3a3x2
2111X2
+
4a4x3
+
3
2112 X
+ 5a5-\·4 +
6a&y5
+ ...
+
2114X5
+ ..
2a3x4
+
For this equation to hold, the two coefficients of every power of x on both sides must be equal. that is.
Hence
a3 =
0,
1I5 =
0, ... and for the coefficients with even SUbscripts.
ao
3! '
SEC. 5.1
Power Series Method
169
ao remain~ arbitrary. With these coefficients the series (3) gives the following solution. which you should confirm
by the method of separating variables.
More rapidly, (3) and (4) give for the ODE y' = 2\:"
x
] ·UIXo
+
x
x
L
L
m 1
InunzX = 2\-
m=O
1n=2
L
=
DmXTn
2am x'11 + 1
m=O
Now, to get the same general power on both sides, we make a "shift of index" on the left by sening III = S + 2,
thus 111 - I = s + I. Then am becomes lIs+2 and x",-I becomes i'+I. Also the summation. which started with
m = 2. now starts with s = 0 because s = /11 - 2. On the right we simply make a change of notation /11 = S,
hence lim = as and X"H I = x s + 1: abo the summation now starts with s = O. This altogether gives
<Xl
L (s + 2)aS+2xs+I = L 2llsXS+I.
+
al
s=o
s~o
Every occurring power of x must have the same coefficient on both sides: hence
and
(s
+
2
or
2kls+2 = 2l1s
a s +2 =
s
+ 2
as'
For s = 0, I. 2.... we thus have a2 = (2/2)lIo, a3 = (2/3)aI = O. a 4 = (2/4)a2' ... as before.
EXAMPLE 2
•
Solve
y" +
Solutioll.
O.
y =
By in,erting (3) and (4b) into the ODE we have
x
L
/11(111 -
l)llmxm-2
m~2
L
+
7n
am x
=
O.
m~O
To obtain the same general power on both selies. we set 11/ =
and then we take the laner to the right side. This gives
S
+ 2 in the first series and 111 = s in the second,
""
L (s + 2)(.\' +
ex;
I)lI s +2"'s =
5=0
L {{sXS.
5=0
S
Each power X must have the same coefficient on both sides. Hence
recursion formula
(s
+
2)(s
+ I )lls+2 =
-as'
This gives the
{{s
a s +2 = -
-,----,--"----(s + 2)(s + I)
(s = 0, 1, .. ').
We thu, obtain successively
112 =
lI4 =
and so on. ao and
{{I
110
lIo
2' I
2!
a2
110
4'3
4!
a3 =
a5 =
al
al
3·2
3!
a3
al
5'4
5!
remain arbitrary. With these coefficients the series (3) becomes
'" = ao
+
{{IX -
ao
2!
x
2
-
~ \.3 + {{o \.4 +!!.!.
3! .
4! .
5!
X
5
+
CHAP. 5
170
Series Solutions of ODEs. Special Functions
Reordering terms lwhich is permissible for a power series), we can write this in the form
+
X3
a1 ( x -
3!
+
X5
5! -
+ ...
)
and we recognize the familiar general solution
•
y = Go cosx + G1 sinx.
Do we need the power series method for these or similar ODEs? Of course not; we used
them just for explaining the idea of the method. What happens if we apply the method
to an ODE not of the kind considered so far, even to an innocent-looking one such as
y" + xy = 0 ("Airy's equation")? We most likely end up with new special functions given
by power series. And if such an ODE and its solutions are of practical (or theoretical)
interest, we name and investigate them in terms of formulas and graphs and by numeric
methods.
We shall discuss Legendre's, Bessel's, and the hypergeometric equations and their
solutions, to mention just the most prominent of these ODEs. To do this with a good
understanding, also in the light of your CAS. we first explain the power series method
(and later an extension, the Frobenius method) in more detail.
11-10 I
POWER SERIES METHOD: TECHNIQUE,
FEATURES
Apply the power series method. Do this by hand, not by a
CAS, so that you get a feel for the method, e.g., why a
series may terminate, or has even powers only, or has no
constant or linear terms, etc. Show the details of your work.
1. y' - y = 0
3. y" + 4y = 0
5. (2 + x)y' = y
7. y'
=
Y
9. y" - )"
111-161
+
x
= 0
2. y' + xy = 0
+ 3(1 +
X2)y
= 0
+ 12x2»)'
8. (x 5 + 4x 3 )y' = (5x 4
10. y" - xy'
+
y
=
0
CAS PROBLEMS. INITIAL VALUE
PROBLEMS
Solve the initial value problems by a power series. Graph
the partial sum s of the powers up to and including x 5 . Find
the value of s (5 digits) at Xl'
5.2
+
4y = 1.
=
1
+
= y -
yeO) = 1.25.
15. y"
+
yeO)
y2,
yeO) =
3xy'
=
=
y2,
14. (x - 2)y' = xy,
/(0)
4. y" - y = 0
6. y'
11. y'
12. y'
13. /
1,
+
+
!,
Xl = 1
yeO) =
30y = 0,
Xl =
!1T
Xl
yeO) = 4,
2y = 0,
Xl = 0.5
16. (1 - X2)y" - 2xy'
/ (0) = l.875,
Xl = 0.2
=
0,
Xl
= 2
1,
yeO)
0,
0.5
17. WRITING PROJECT. Power Series. Write a review
(2-3 pages) on power series as they are discussed in
calculus, using your own formulation and examplesdo not just copy passages from calculus texts.
18. LITERATURE PROJECT. Maclaurin Series.
Collect Maclaurin series of the functions known from
calculus and arrange them systematically in a list that
you can use for your work.
Theory of the Power Series Method
In the last section we saw that the power series method gives solutions of ODEs in the
form of power series. In this section we justify the method mathematically as follows. We
first review relevant facts on power series from calculus. Then we list the operations on
power series needed in the method (differentiation, addition, multiplication, etc.). Near
the end we state the basic existence theorem for power series solutions of ODEs.
SEC. 5.2
Theory of the Power Series Method
171
Basic Concepts
Recall from calculus that a power series is an infinite series of the form
oc
(1)
~ am(x - xoyn = ao + al (x - Xo)
+
a2(X - XO)2
+
m~O
As before, we assume the variable x, the center .\"0' and the coefficients ao,
real. The nth partial sum of (1) is
aI, • . •
to be
where n = 0, 1, .... Clearly, if we omit the terms of s" from (I), the remaining expression
is
(3)
This expression is called the remainder of (1) after the tenn a,/x - xo)n.
For example, in the case of the geometric series
I
+ x + X2 + ... + xn + ...
we have
So = 1,
Sl
=
+ x.
etc.
In this way we have now associated with (1) the sequence of the partial sums
so(x), SI(X), S2(X), .... If for some x = Xl this sequence converges, say,
lim sn(x I )
= S(XI)'
11.-----"'00
then the series (I) is called convergent at X
sum of (I) at Xl, and we write
= Xl,
the number S(XI) is called the value or
00
S(XI)
= ~ am(XI - XOrn.
m~O
Then we have for every n,
(4)
If that sequence diverges at X = Xl> the series (I) is called divergent at X = Xl.
In the case of convergence, for any positive E there is an N (depending on E) such that,
by (4),
(5)
for all n
> N.
172
CHAP. S Series Solutions of ODEs. Special Functions
Geometrically, this means that all Sn(Xl) with n > N lie between s(x l ) - E and s(x l ) + E
(Fig. 102). Practically, this means that in the case of convergence we can approximate
the sum S(Xl) of (I) at Xl by Sn(Xl) as accurately as we please, by taking 11 large enough.
Convergence Interval. Radius of Convergence
With respect to the convergence of the power series (I) there are three cases, the useless
Case I, the usual Case 2, and the best Case 3, as follows.
Case 1. The series (1) always converges at x = xo, because for x = Xo all its terms are
zero, perhaps except for the first one, ao. In exceptional cases x = Xo may be the only x
for which (l) converges. Such a series is of no practical interest.
Case 2. If there are further values of x for which the series converges, these values form
an interval, called the convergence interval. If this interval is finite, it has the midpoint
xo, so that it is of the form
Ix - xol <
(6)
(Fig. 103)
R
and the series (1) converges for all x such that Ix - xol < R and diverges for all x such
that Ix - xol > R. (No general statement about convergence or divergence can be made
for x - Xo = R or -R.) The number R is called the radius of convergence of 0). (R is
caned "radius" because for a complex power series it is the radius of a disk of convergence.)
R can be obtained from either of the formulas
(7)
(a)
R
= l/lim
111~'JC
Vfa:f
(b)
=
R
1
him
/ m_x,
I
am+l
lint
I
provided these limits exist and are not zero. [If these limits are infinite, then
only at the center xo.]
(1)
converges
Case 3. The convergence interval may sometimes be infinite, that is, (l) converges for
all x. For instance, if the limit in (7a) or (7b) is zero, this case occurs. One then writes
R = x, for convenience. (Proofs of all these facts can be found in Sec. 15.2.)
For each x for which (1) converges. it has a certain value sex). We say that (1) represents
the function sex) in the convergence interval and write
00
seX) =
L
(Ix - xol <
{/m(X - Xo)m
R).
m~O
Let us illustrate these three possible cases with typical examples.
Divergence iconvergence
~E--+-E_I
I
I
I
Fig. 102.
I
Inequality (S)
I
Fig. 103.
-R·
~I'
I
------j Divergence
R-I
I
Convergence interval (6) of a power
series with center Xo
SEC. 5.2
173
Theory of the Power Series Method
E X AMP LEI
The Useless Case 1 of Convergence Only at the Center
In the case of the series
~ m!x'" = 1 + x + 2x2 + 6x 3 + ...
m=O
we have am.
=
Ill!, and in (7b).
(1/1
a",+1
--=
am.
+ I)!
=m+1-,,<o
as
,n
----7
•
Thus this series converges only at the center x = O. Such a series is useless.
E X AMP L E 2
ro.
In!
The Usual Case 2 of Convergence in a Finite Interval. Geometric Series
For the geometric series we have
1
x
--=~x
I-x
m
=I+x+x
2
(Ixl
+ ...
In fact, am = 1 for all m, and from (7) we obtain R = I, that is. the geometric series converges and
1/(1 - x) when Ixl < L
E X AMP L E 3
< I).
m=O
represent~
•
The Best Case 3 of Convergence for All x
In the case of the series
+
1+x
x2
+ ...
2!
we have a", = 11m!. Hence in (7b),
l/{m + I)!
11m!
111
+
1
-,,0
as
~
co,
•
so that the series converges for all x.
E X AMP L E 4
111
Hint for Some of the Problems
Find the radius of convergence of the series
OJ
~
L.J
(
__I)'"
_
8'"
x3
.3m_
.\
- I -
8 +
x6
x9
ill + - ....
64 -
'tn=O
Solution.
This is a senes in powers of t = x 3 with coefficients am = (-1)"'/8"', so that in (7b),
I
a"'+1
Thus R = 8. Hence the series converges for
am
Itl
=
I=
~
8",+1
Ix 3 1<
=
.!.
8'
8, that is,
Ixl
< 2.
•
Operations on Power Series
In the power series method we differentiate, add, and multiply power series. These three
operations are permissible, in the sense explained in what follows. We also list a condition
about the vanishing of all coefficients of a power series, which is a basic tool of the power
series method. (Proofs can be found in Sec. 15.3.)
174
CHAP. 5
Series Solutions of ODEs. Special Functions
Termwise Differentiation
A power series may be d(fferenlialed Term by Term. More precisely: if
"L
y(x) =
am(x -
X O)111
m~O
converges for Ix - xol < R, where R > 0, then the series obtained by differentiating term
by term also converges for those x and represents the derivative y' of y for those x,
that is,
x
y' (x)
=
"L
17Ul m {.X -
xo)'n-l
(Ix - xol < R).
m~l
Similarly,
y"(x)
=
"L
m(m -
l)am(x - xo)m-2
(Ix - xol <
R), etc.
m~2
Termwise Addition
Two power series lIlay be added term by term. More precisely: if the series
GC
"L
and
(8)
bm(x - xo)m
m~O
have positive radii of convergence and their sums are f(x) and g(x). then the series
CXJ
"L
(am
+ bm)(x -
xo)m
m~O
converges and represents f(x) + g{x) for each x that lies in the interior of the convergence
interval of each of the two given series.
Termwise Multiplication
Two power series may be multiplied Tel7ll by Term. More precisely: Suppose that the series
(8) have positive radii of convergence and let f(x) and g(x) be their sums. Then the
series obtained by multiplying each term of the first series by each term of the second
series and collecting like powers of x - Xo, that is,
GC
"L
(aob m
+ a1b m- 1 + ... +
ambo)(x - xo)m
m~O
converges and represents f(x)g(x) for each x in the interior of the convergence interval of
each of the two given series.
SEC. S.2
Theory of the Power Series Method
175
Vanishing of All Coefficients
If a power series has a positive
radius of convergence and a sum that is identically zero
throughout its illterval of convergence, then each coeffIcient of the series must be zero.
Existence of Power Series Solutions of ODEs.
Real Analytic Functions
The properties of power series just discussed form the foundation of the power series
method. The remaining question is whether an ODE has power series solutions at all. An
answer is simple: If the coefficients p and lj and the function r on the right side of
(9)
y"
+ p(x)y' + q(x)y =
r(x)
have power series representations, then (9) has power series solutions. The same is true
if h, p, q, and r in
(10)
h(x)y"
+ p(x)y' + q(x»)'
= r(x)
have power series representations and h(xo) *- 0 (xo the center of the series). Almost all
ODEs in practice have polynomials as coefficients (thus te1l11inating power series), so that
(when r(x) == 0 or is a power series, too) those conditions are satisfied, except perhaps
the condition h(xo) *- O. If h(xo) *- 0, division of (10) by h(x) gives (9) with p = pIli,
q = qlh, r = 'ilh. This motivates our notation in (0).
To formulate all this in a precise and simple way, we use the following concept (which
is of general interest).
DEFINITION
Real Analytic Function
A real function f(x) is called analytic at a point x = Xo if it can be represented by
a power series in powers of x - Xo with radius of convergence R > O.
Using this concept, we can state the following basic theorem.
THEOREM 1
Existence of Power Series Solutions
If p,
x =
q, and r in (9) are analytic at x = x o, then every SoluTion of (9) is analYTic aT
Xo and can thus be represenTed by a power series in powers of x - Xo with
radius of convergence R > O. Hence the same is true if h, p, q, and r in (10) are
analytic at x = Xo and h(xo) *- O.
The proof of this theorem requires advanced methods of complex analysis and can be
found in Ref. [All] listed in App. 1.
We mention that the radius of convergence R in Theorem I is at least equal to the
distance from the point x = Xo to the point (or points) closest to Xo at which one of the
functions p, q, r, as functions of a complex variable, is not analytic. (Note that that point
may not lie on the x-axis but somewhere in the complex plane.)
176
CHAP. 5
..
=
/1-12/
Series Solutions of ODEs. Special Functions
RADIUS OF CONVERGENCE
Determine the radius of convergence. (Show the details.)
0::
1.
L
y7n
~
1= 0)
(c
m~O C
15.
2
L
p~l (p
/16-23/
(m
+
L
I)m
(x -
3)271>
(-I )7l1 x 4m
"n1=O
5.
L
m~O
00
4 xm
(2111 + 2 )(2m + )
x
(m!)
X
2m" 10
(-l)m
~
(1)2m
7 'L..~xm~2
~
8. L..
(4m)!
--4
71>~1 (m!)
(Ill
m=4
10.
+
xnt
3)2
x m.
(111 - 3)4
= (7_111.
)'
L.. - - 2 -
~
11. L..
Xm
111
1
"r"
1m
(x - 2 1T)
7n=1
12•
(m + 1)111
~
L..
71l~1 (2m + \)!
/13-15/
16. y " + xy = 0
,
17. .I' " - Y + x 2 y = 0
,
18. y " - y + xy = 0
,
19. y " + 4xy = 0
20. y " + lxy + y = 0
21. y" + (I +
X2»)' =
23. (2x 2
-
2
0
2).1'
=
0
3x + I)y" + 2xv' - 2." = 0
24. TEAM PROJECT. Properties from Power Series.
In the next sections we shall define new functions
(Legendre functions. etc.) by power series. deriving
properties of the functions directly from the series. To
understand this idea. do the same for functions familiar
from calculus. using Maclaurin series.
(a) Show that cosh x + sinh x = eX. Show that
cosh x > 0 for all x. Show that eX ;:;:; e- x for all
x;:;:; O.
(b) Derive the differentiation formulas for eX. cos x,
sinx. 11(1 - x) and other functions of your choice.
Show that (cos xl" = -cos x. (cosh :d' = cosh x.
Consider integration similarly.
~
",~1
POWER SERIES SOLUTIONS
22. y" - 4xy' + (4x
I)m
--6. ~
L..
2
m~O
l)!
,
(2m)!
(
+
Find a power series solution in powers of x. (Show the
details of your work.)
m~l
4.
xp + 4
p
x21n+l
SHIFTING SUMMATION INDICES
(CF. SEC. 5.1)
Thi~ is often convenient or nece~sary in the power series
method. Shift the index so that the power under the
summation sign is xS. Check by writing the first few terms
explicitly. Also determine the radius of convergence R.
(c) What can you conclude if a series contains only
odd powers? Only even powers? No constant tenn? If
all its coefficients are positive? Give examples.
(d) What properties of cos x and sin x are lIot obvious
from the Maclaurin series? What properties of other
functions?
25. CAS EXPERIMENT. Information from Graphs of
Partial Sums. In connection with power series in
numerics we use partial sums. To get a feel for the
accuracy for various x. experiment with sin x and
graphs of partial sums of the Maclaurin series of an
increasing number of terIllS, describing qualitatively
the "breakaway points" of these graphs from the
graph of sin x. Consider other examples of your own
choice.
SEC. 5.3
5.3
Legendre's Equation. Legendre Polynomials Pn(x)
Legendre's Equation.
Legendre Polynomials Pn{x)
In order to first gain skill, we have applied the power series method to ODEs that can
also be solved by other methods. We now turn to the first "big" equation of physics, for
which we do need the power series method. This is Legendre's equationl
(l - x 2 )y" - 2AY' + n(n
(1)
+
I)y
=
0
where n is a given constant. Legendre's equation arises in numerous problems, particularly
in boundary value problems for spheres (take a quick look at Example I in Sec. 12.10).
The parameter n in (1) is a given real number. Any solution of (1) is called a Legendre
function. The study of these and other "higher" functions not occurring in calculus is
called the theory of special functions. Further special functions will occur in the next
sections.
Dividing 0) by the coefficient 1 - x 2 of y". we see that the coefficients -2x/(1 - x 2 )
and n(n + 1)/(1 - x 2 ) of the new equation are analytic at x = O. Hence by Theorem I,
in Sec. 5.2. Legendre's equation has power series solutions of the form
(2)
Substituting (2) and its derivatives into (1), and denoting the constant n(n
k, we obtain
ex:;
l) amxm.-2
00
2x.L mamxm - l
-
+ 1) simply by
+
k"L amx m = O.
m=1
"m,=2
By writing the first expression as two separate series we have the equation
00
"L
00
X)
m(m
l)amx m - 2
-
"L
m(1Il -
l)am x m -
"L
X)
2mam x m +
"L
kamx m
= O.
m=2
To obtain the same general power X S in all four series, we set m - 2 = s (thus m = s + 2)
in the first series and simply write s instead of III in the other three series. This gives
00
"L (s + 2)(s +
00
l)as +2-C\:s -
"L s(s -
00
I)asx s -
00
"L 2sasxs + "L kasxs =
O.
lADRIEN-MARIE LEGENDRE (1752-1833). French mathematician. who became a professor in Paris in
1775 and made important contributions to special functions, elliptic integrals, number theory, and the calculus
of variations. His book Elements de geollletrie (1794) became very famous and had 12 editions in less than 30
years.
Fonnulas On Legendre functions may be found in Refs. [GRJ] and [GRIO].
178
CHAP. 5
Series Solutions of ODEs. Special Functions
(Note that in the first series the summation begins with s = 0.) Since this equation with
right side 0 must be an identity in x if (2) is to be a solution of (1), the sum of the
coefficients of each power of x on the left must be zero. Now X O occurs in the first and
fourth series and gives [remember that k = n(n + 1)]
(3a)
"\"1
2 . la2
+
l1(n
+
= O.
1) ao
occurs in the first, third, and fourth series and gives
(3b)
3' 2a3 +[-2 +
The higher powers x 2 , x 3 ,
•••
(3c)
(s
+
+
2)(s
+
n(n
= O.
l)]aI
occur in all four series and give
l)aS +2
+
[-s(s -
I) -
2s
+
+
n(n
l)]a s
= O.
The expression in the brackets [ .. ·1 can be written (n - s)(n + s + I), as you may
readily verify. Solving (3a) for a2 and (3b) for a 3 as well as (3c) for a s +2' we obtain the
general formula
(n - s)(n
(4)
(s
+s +
+ 2)(s +
1)
(s
1)
=
0, 1, ... ).
This is called a recurrence relation or recursion formula. (Its derivation you may verify
with your CAS.) It gives each coefficient in terms of the second one preceding it except
for a o and aI, which are left as arbitrary constants. We find successively
n(n
a2 = -
+
(n -
I)
2!
I)(n
+
2)
+
4)
3!
(II - 2)(11
+
3)
(n -
5·4
4·3
(n - 2)1l(11
3)(n
+
1)(/1
+
(11 - 3)tll -
3)
I )(Il
4!
+ 2)(11
+
4)
5!
and so on. By inserting these expressions for the coefficients into (2) we obtain
(5)
y(x)
=
aoY! (x)
(n -
2)11(11
+
aIY2(x)
where
+
11(11
1)
2!
+
(6)
Yl(X) =
1-
(7)
)'2(X) =
x - - - - - - - x3
(n -
X2
l)(ll
3!
+
2)
+
1)(11
+
3)
4!
+
(n - 3)(11 -
x4 -
1)(11
5!
+
+ ...
2)(11
+
4)
x5
-
+ ....
These series converge for Ixl < (see Prob. 4; or they may terminate, See below). Since
(6) contains even powers of x only, while (7) contains odd powers of x only, the ratio
YtiY2 is not a constant, so that Yl and Y2 are not proportional and are thus linearly
independent solutions. Hence (5) is a general solution of (I) on the interval - I < x < I.
SEC. 5.3
179
Legendre's Equation. Legendre Polynomials Pn(x)
Legendre Polynomials
Pn{x)
In various applications. power series solutions of ODEs reduce to polynomials. that i~.
they terminate after finitely many terms. This is a great advantage and is quite common
for special functions. leading to various important families of polynomials (see Refs. [GR I]
or [GRIO] in App. 1). For Legendre's equation this happens when the parameter n is a
nonnegative integer because then the right side of (4) is zero for s = n, so that an +2 = 0,
a n +4 = 0, (In+6 = 0, .... Hence if n is even, hex) reduces to a polynomial of degree n.
If II is odd, the same is true for Y2(X). These polynomials, multiplied by some constants.
are called Legendre polynomials and are denoted by Pn(x). The standard choice of a
constant is done as follows. We choose the coefficient an of the highest power xn as
I . 3 . 5 . . . (2n - I)
an =
(8)
(n
n!
a positive integer)
(and an = 1 if n = 0). Then we calculate the other coefficients from (4). solved for as in
terms of as +2, that is,
(9)
(Is
(s + 2)(s + 1)
= - ------(n - s)(n + s + 1)
(ls+2
(s
~
n - 2).
The choice (8) makes P ,,(I) = 1 for every n (see Fig. 104 on p. 180); this motivates (8).
From (9) with s = 11 - 2 and (8) we obtain
11(11 - 1)(2n)!
1)
n(n -
l)
2(211 -
Using (2n)! = 2n(211 - 1)(211 - 2)!,
obtain
II!
(In = -
=
n(1I -
2(211 - 1)2n( 11 !)2
1)!, and n! = n(1I -
n(n - 1)2n(2n - 1)(2n - 2)!
an -2 = - ----'----'-----'---_..:...:.._---'--2(211 - 1)2nn(1I - I)! n(n - 1)(n - 2)!
n(11 -
1)211(211 -
1)
cancels. so that we get
(211 - 2)!
Similarly,
(11 - 2)(n - 3)
4(211 - 3)
(211 - 4)!
2 n 2! (11 -
and so on, and in general, when
(10)
(In-2m
=
11 -
2111
~
2)!
(n -
4)!
0,
(211 - 2111)!
(-I)m - - - - - - - - - 2nm! (n - Ill)! (n - 2m)!
1)(n -
2)!, we
180
CHAP. 5
Series Solutions of ODEs. Special Functions
The resulting solution of Legendre's differential equation (l) is called the Legendre
polynomial of degree n and is denoted by P n(x).
From (10) we obtain
M
P n(x) =
'L
(2n - 2m)!
(-1)'m - - - ' - - - - - - ' - - - - x n - 2'm
2nm! (n - m)! (n - 2m)!
(11)
where M = nl2 or (11 - I )/2, whichever is an integer. The first few of these functions are
(Fig. 104)
(11')
Po(X)
= 1,
P 2 (x)
=
P 4 (x)
= ~(35x4 - 30x 2 + 3),
!(3x 2
-
1),
PI(X)
=
P 3 (x)
= !(5x 3
P 5(x)
=
x
-
3x)
~(63x5 - 70x 3
+ I5x)
and so on. You may now program (11) on your CAS and calculate Pn(x) as needed.
The so-called orthogonality of the Legendre polynomials will be considered in
Sees. 5.7 and 5.8.
x
Fig. 104.
======= ="
Legendre polynomials
-
1. Verify that the polynomials in (11') satisfy Legendre's
equation.
2. Derive (11 ') from (11).
3. Obtain P6 and P 7 from (11).
4. (Convergence) Show that for any 11 for which (6) or
(7) does nol reduce to a polynomial, the series has
radius of convergence 1.
5. (Legendre function Qo(x) for n = 0) Show that (6)
with 11 = 0 gives Yl(X) = Po(x) = I and (7) gives
Y2(X) = x
2
+ -
=x+
3!
x3
3
x3
+
(-3)(-1)·2·4
5!
x5
+ ...
X5
I
I +x
+-+···=-In--.
5
2
I-x
Legendre's Equation. Legendre Polynomials Pn{x)
SEC. 5.3
Verify this by solving (\) with
and separating variables.
z
0, setting
/I =
=
y'
=
6. (Legendre function -Ql(X) for II
1) Show that (7)
with 11 = I gives )"2(X) = PI(x) = x and (6) gives
)"I(X) = -Ql(X) (the minus sign in the notation being
conventional),
)"I(X)
= I -
3
I - x
5
1
7. (ODE) Find a solution of
2
+
+
l)y = 0. a
by reduction to the Legendre equation.
(a
-
x )y" -
2xy'
n(/1
*'
r
0.
8. [Rodrigues's formula (12)]2 Applying the binomial
theorem to (X2 - I)n, differentiating it 11 times term
by term. and comparing the result with (II), show
that
(12)
9. (Rodrigues's formula) Obtain (II ') from (12).
110-131
(13)
(b) Potential theory. Let Al and A2 be two points in
space (Fig. 105, r2 > 0). Using (13), show that
I +x
I - x
= 1- - x l n - -
2
(a) Legendre polynomials. Show that
is a generating function of the Legendre polynomials.
Hint: Start from the binomial expansion of 11"\ 1 - v.
then set v = 2xlI - u 2 • multiply the powers of
2m - u 2 out. collect all the terms involving un, and
verify that the slim of these terms is Pn(x)u n .
(r + .~ + .~5 + ...)
I
2
181
Vr12
+
r22
2rlr2 cos
-
e
This formula has applications in potential theory.
(Qlr is the electrostatic potential at A2 due to a
charge Q located at Al . And the series expresses I1r
in terms of the distances of Al and A2 from any origin
o and the angle e between the segments OA I and
OA 2 ·)
CAS PROBLEMS
10. Graph P 2 (x) • ...• P IO (.\) on common axes. For what
\" (approximately) and II = 2... " 10 is Ipn(x)1 <!?
It. From what
/I on will your CAS no longer produce
faithful graphs of P n(x)? Why?
12. Graph Qo(x), QI (x), and some further Legendre
functions.
13. Substitute asxs
+ a s + IX s + 1 + as+2xs+2 into Legendre's
Fig. 105.
Tearn Project 14
(c) Further applications of (13). Show that
Pn(l) = I, P n ( -I) = (-It'. P 2n + 1 (0) = 0, and
equation and obtain the coefficient recursion (4).
P 2n(O)
14. TEAM
PROJECT.
Generating
Functions.
Generating functions playa significant role in modem
applied mathematics (see [GR5]). The idea is simple.
If we want to study a certain sequence (fn(x» and can
find a function
C(u, x) =
L""
fn(x)u n ,
n=O
we may obtain properties of (f ,,(x» from those of C.
which ""generates" this sequence and is called a
generating function of the sequence.
= (-I)n'I'3'"
(211 -
1)/[2,4", (2n)].
(d) Bonnet's recursion. 3 Differentiating (\3) with
respect to u, using (13) in the resulting formula, and
comparing coefficients of un, obtain the Bonnet
recursion
(14)
(II
+
l)Pn+l(x) = (211
+
I)xP,/:r) - IlPn_I(X),
where Il = I, 2, .... This formula is useful for
computations, the loss of significant digits being small
(except near zeros). Try ( 14) out for a few computations
of your own choice.
20UNDE RODRIGUES (1794-1851). French mathematician and economist.
30SSIAN BONNET (1819-1892), French mathematician. whose main work was in differential geometry.
CHAP. 5 Series Solutions of ODEs. Special Functions
182
15. (Associated Legendre functions) The associated
and are solutions of the ODE
Legendre functions Pnk(x) play a role in quantum
physics. They are defined by
(I - x 2 )y" (16)
+
(15)
2\)"
[n(n + I) - ~
]
1- x
2
y
=
O.
Find P 11(X), P 21(X), P 22(X), and P42(X) and verify that
they satisfy (16).
5.4
Frobenius Method
Several second-order ODEs of considerable practical importance-the famous Bessel
equation among them-have coefficients that are not analytic (definition in Sec. 5.2), but
are "not too bad," so that these ODEs can still be solved by series (power series times a
logarithm or times a fractional power of x, etc.). Indeed, the following theorem permits
an extension of the power series method that is called the Frobenius method. The latteras well as the power series method itself-has gained in significance due to the use of
software in the actual calculations.
THEOREM 1
Frobenius Method
Let b(x) and c(x) be any Junctions that are analytic at x
y"
(1)
b(x),
c(x)
x
+ -- y + -v
2
X
=
= O. Then the ODE
0
-
has at least one solution that can be represented in the JOI7I1
co
y(x) = xT.L a",x'in = xT(ao
(2)
+ alx + a 2 x 2 + ...)
(ao
*" 0)
7n~0
where the exponent r may be any (real or complex) number (and r is chosen so that
ao
0).
The ODE (1) also has a second solution (such that these two solutions are linearly
independent) that may be similar to (2) (with a different r and different coefficients)
or m£ly contain a logarithmic tenn. (Details in Theorem 2 below.)4
*"
For example, Bessel's equation (to be discussed in the next section)
y"
+ 1 y' +
X
(X2 - V2)
x2
V =
0
(va parameter)
-
4GEORG FROBENIUS (1849-1917), German mathematician, also known for his work on matrices and in
group theory.
0 is no restriction: it
In this theorem we may replace x by x - Xo with any number xo. The condition ao
simply means that we factor out the highest possible power of x.
The singular point of (1) at x = 0 is sometimes called a regular singular point, a term confusing to the
student, which we shall not use.
*
SEC. 5.4
183
Frobenius Method
=
=
=
is of the form (I) with b(x)
1 and c(x) x 2 - v 2 analytic at x 0, so that the theorem
applies. This ODE could not be handled in full generality by the power series method.
Similarly, the so-called hypergeometric differential equation (see Problem Set 5.4) also
requires the Frobenius method.
The point is that in (2) we have a power series times a single power of x whose exponent
r is not restricted to be a nonnegative integer. (The latter restriction would make the whole
expression a power series, by definition; see Sec. 5.1.)
The proof of the theorem requires advanced methods of complex analysis and can be
found in Ref. [A 11] listed in App. I.
Regular and Singular Points
The fonowing commonly used terms are practical. A regular point of
y"
+ p(x)y' +
q(x)y = 0
is a point Xo at which the coefficients p and q are analytic. Then the power series method
can be applied. If Xo is not regular, it is called singular. Similarly, a regular point of the
ODE
h(x)y"
+ p(x)y' (x) +
q(x)y = 0
is an Xo at which h. p. q are analytic and h(xo) -=I=- 0 (so what we can divide by h and get
the previous standard form). If Xo is not regular. it is called singular.
Indicial Equation, Indicating the Form of Solutions
We shall now explain the Frobenius method for solving (1). Multiplication of (1) by x 2
gives the more convenient form
2
x y"
(1')
+ xb(x)y' + c(x)y = O.
We first expand b(x) and c(x) in power series.
or we do nothing if b(x) and c(x) are polynomials. Then we differentiate (2) term by term,
finding
GC
y' (x)
=.L
(/11
+
r)amx'm+r-l
(m
+
r)(m
= xr -
1
[rao
'm=o
co
(2*)
-,",'(x) =
.L
+
r -
I ) amx'm+r-2
m~O
By inserting all these series into (1') we readily obtain
(3)
+
(r
+ l)alx + ... ]
184
CHAP. 5
Series Solutions of ODEs. Special Functions
We now equate the sum of the coefficients of each power XT, XT+l, XT+2, ••• to zero. This
yields a system of equations involving the unknown coefficients (1m- The equation
cOlTesponding to the power x" is
[r(r - I)
Since by assumption ao
+ bor + colao = o.
*- 0, the expression in the brackets [ ... ] must be zero. This gives
r(r - I) + bor + Co = O.
(4)
This important quadratic equation is called the indicial equation of the ODE (I ). Its role
is as follows.
The Frobenius method yields a basis of solutions. One of the two solutions will alway~
be of the form (2), where r is a root of (4). The other solution will be of a form indicated
by the indicial equation. There are three cases:
Case 1. Distinct roots not differing by an integer I. 2. 3.....
Case 2. A double root.
Case 3. Roots differing by an integer I. 2, 3.....
Cases I and 2 are not unexpected because of the Euler-Cauchy equation (Sec. 2.5), the
simplest ODE of the form (1). Case I includes complex conjugate roots r 1 and r2 = rl
because rl - r2 = rl - rl = 2i 1m /"1 is imaginary. so it cannot be a real integer. The
form of a basis will be given in Theorem 2 (which is proved in App. 4). without a general
theory of convergence, but convergence of the occurring series can he tested in each
individual case as usuaL Note that in Case 2 we must have a logarithm, whereas in Ca'>e
3 we mayor may 110t.
THEOREM 2
Frobenius Method. Basis of Solutions. Three Cases
Suppose that the ODE (1) satisfies the assumptions in Theorem I. LRt /"1 and r2 be
the roots of the indicial equation (4). Then we have the following three cases.
Case 1. Distinct Roots Not Differing by all Integer. A basis is
(5)
al1d
(6)
with coefficients obtained successivelyfrom (3) with r = rl and r = r2, respectivel)'.
Case 2. Double Root rl = r2 = r. A basis is
(7)
[r = ~(1
- bo)]
(of the .\£lme general form as before) and
(8)
(x> 0).
SEC. 5.4
Frobenius Method
185
Case 3. Roots Differing by an Integer. A basis is
(9)
(of the same generalfoml as before) and
(10)
>
where the roots are so denoted that rl - r2
0 and k may tum out to be zero.
Typical Applications
Technically, the Frobenius method is similar to the power series method, once the roots
of the indicial equation have been determined. However, (5)-00) merely indicate the
general form of a basis, and a second solution can often be obtained more rapidly by
reduction of order (Sec. 2.1).
E X AMP L E 1
Euler-Cauchy Equation, Illustrating Cases 1 and 2 and Case 3 without a Logarithm
For the Euler-Cauchy equation (Sec. 2.5)
(b o, Co constant)
substitution of y ~ x T gives the auxiliary equation
r(r -
which is the indicial equation [and y
=x
T
1)
+
bor
+
Co = 0,
is a very special form of (2)!]. For different roots rI, r2 we get a
basis YI = XTt,.I'2 = XT2, and for a double root r we get a basis XT, x T lnx. Accordingly, for this simple ODE,
Case 3 plays no extra role.
•
E X AMP L E 2
Illustration of Case 2 (Double Root)
Solve the ODE
x(x -
(11)
I)y"
+
(3x -
I)y'
+ Y=
O.
(This is a special hypergeometric equation, as we shall see in the problem set.)
Solution.
Writing (11) in the standard form (1), we see that it satisfies the assumptions in Theorem I. [What
are b(x) and c(x) in (II )?] By inserting (2) and its derivatives (2*) into (11) we obtain
co
L
(m
+
r)(m
+
r -
I)a",xm + r -
m=O
L
(m
+
r)(m
+
r -
l)a'mx'm+T-l
7n=O
(12)
m=O
7n=O
7n=O
The smallest power is xT-t, occurring in the second and the fourth series; by equating the sum of its coefficients
to zerO we have
[-r(r -
1) - r]ao
=
0,
Hence this indicial equation has the double root r = O.
thus
186
CHAP. 5
Series Solutions of ODEs. Special Functions
First Solutioll.
X
S
We insert this value
0 into (12) and equate the ~um of the coefficients of the power
I" =
to zero. obtaining
s(s -
thus {/s+1
=
(/s.
l)0s - (s
=
Hence 00
01
=
02
+
+
llSOs+1
3sos -
+ I )os+1 +
(s
Os =
0
= .... and by choosmg 00 = I we obtain the solution
I
m
L
0:::
\'I(X) =
x
(Ixl
= ---
I-x
1n=O
<
I).
Second Solution.
We get a second independent solution)"2 by the method of reduction of order (Sec. 2.1).
substituting)"2 = 11.\"1 and its derivatives into the equation. This leads to (9). Sec. 2.1. which we shall use in this
example. instead of starting reduction of order from scratch (as we shall do in the next example). In (9) of
Sec. 2.1 we have p = (3.1' - I )/(x2 - x). the coefficient of y' in (11) ill stalldard form. By partial fractions.
-J
pdt
=
-J
3x - I
.1'(.l-1)
dx
=
-J
(_2_
+ -.:.) dx =
x-I,
-2 In (x - I) - In x .
Hence (9), Sec. 2.1, becomes
)"1
Inx
In x,
11=
x
)"2 = 11.\'1 =
I -x .
and )"2 are shown in Fig. 106. These functions are linearly independent and thus form a basis on the interval
1 (as well as on I < x < X).
•
o< x <
Fig. 106.
E X AMP L E 3
Solutions in Example 2
Case 3, Second Solution with Logarithmic Term
Solve the ODE
(x 2 -
(13)
Solution.
t)y" -
t)-'
2
-
o.
=
Substituting (2) and (2*) into (13), we have
cc
00
(x
+ )"
x)
L
(111
+
r)(111
+
r -
1){/ m x 'llt+T-2 -
t
m 0
x
L
(111
+
rJo",xm +,'- l
+
0
7ft =
L
omx7lt+T =
O.
TTL=O
We now take x 2 , x. and x inside the summation~ and collect all tenns with power x"'+r and simplify algebraically,
'XC
L
+
(m
r -
1n=O
(/II
+
1")(/11
+
r -
l)lI m x",+r-l
= O.
nz,=O
In the first scnes We set
/II
= S and in the second
oc
(14)
L
l)2omxm+r -
L
s=o
III
=
S
+ L thus s =
/II -
+
r)lI S +1.t'H" =
1. Then
x
(s
+
I" -
1)2{/sxs+r -
L
s=-l
(s
r
+
I)(s
+
O.
SEC. 5.4
Frobenius Method
187
The lowest power is x r -
1
=
(take s
-I in the second series) and gives the indicial equation
r(r -
The roots are
/"1
o.
1) =
= 1 and r2 = U. They differ by an integer. This is Case 3.
First Solution.
From (14) with r = rl = I we have
L
1·,211s -
+ 21(s + IllIs +ljxS + 1 = O.
(s
s~o
This gives the recurrence relation
as+l =
Hence
O.
"1 =
{/2
+
(s
= 0, ... successively. Taking
2)(s
+
~ 1, we get
"0
(s = 0, I, ., ').
1) as
a."
a first solution
Second Solution.
Y~ = xu"
+
Applying reduction of order (Sec. 2.1), we substitute Y2 =
2u' into the ODE, obtaining
(x 2
xu drops
Olll.
-
XJlXll"
+
2u') - x(x,,' + ul
+ XII
=
1'1
Ylll
= ./lao = x.
=
Xli,
y~ =
Xll'
+ u and
O.
Division by x and simplification give
(x 2 - x)u"
+
(x - 2)11' =
O.
From this. using partial fractions and integrating (taking the integration constant zero). we get
u"
2
1/
x
,
+
~
I
x - I
Inu'=ln ~.
I
- I
Taking exponents and integrating (again taking the integration constant zero), we obtain
1/
,
=
x
X
1/ =
2 •
Inx +
)'2 = XI/ =
x
x Inx + 1.
Yl and ."2 are linearly independent. and."2 has a logarithmic term. Hence."l and."2 constitute a basis of solutions
•
for positive .t.
The Frobenius method solves the hypergeometric equation. whose solutions include
many known functions as special cases (see the problem set). In the next section we use
the method for solving Bessel's equation.
BASIS OF SOLUTIONS BY THE
FROBENIUS METHOD
Find a basis of solutions. Try to identify the series as
expansions of known functions. (Show the details of your
work.)
1. xy" + 2y' - xy = 0
2. (x + 2)2)''' - 2)' = 0
11-171
3. xv" + 51" + xy = 0
4. 2xy" + (3 - 4.1.'lY' + (2x - 3)y
2
5. x )"" + -1-.1:\" + (x + 2»)"
6. 4.1.')," + 2/ + y = 0
7. (x
2
=
0
x/' +
=
0
(x + 1»), = 0
+ 2x 3 ),' + (x 2 - 2)y = 0
11. (x 2 + f)Y" + (4x + 2»),' + 2)" = 0
12. x 2 y" + 6xy' + (4x 2 + 6)y = 0
13. 2x)"" - (8x - 1)y' + (8x - 2).\'
0
14• .1.'y" + y' - xy = 0
10.
(2x
+ 1)/ +
.1'2)'''
15. (x - 4)2)''' - (x - 4)y' - 35y
0
+ 3)2)''' - 9(x + 3»),' + 25y
8. xy" - }'
9.
16. x 2 y"
o
17. v"
+
+
4xy' - (x 2
(x -
6»)' = 0
-
2)y = 0
o
CHAP. 5
188
Series Solutions of ODEs. Special Functions
18. TEAM PROJECT. Hypergeometric Equation,
Series, and Function. Gauss' 5 hypergeometric ODE5
is
(15)
+
TO - x)}""
[e - (0
+
b
+
l)x]/ - aby = O.
Here. a. b, e are constants. This ODE is of the form
P2y" + Ply' + Po)" = 0, where P2' PI' Po are
polynomials of degree 2, 1, 0, respectively. These
polynomials are written so that the series solution takes
a most practical form. namely.
In (1
+ --
heX) = ]
I! e
x
+
ala
+
+
I)b(b
I)
2! e(e + I)
I, 2;
= xF(l.
1 + T
In - - ' = 2tF(.12' I ,
I -x
-x),
.3.
2' .T2) .
Find more such relations from the literature on special
functions.
(d) Second solution. Show that for /"2 = I - c the
Frobenius method yields the following solution (where
e =fo 2.3.4.... ):
\'2(.\") = x ] -c I
(
a - c
+
x2
+
I)(h - c
+
I)
I! (-c + 2)
07}
+ Ca - c +
ab
+ x)
l)(a - c + 2}Ch - c + I)Ch - c
2! (-c + 2)(-c + 3)
x
+ 2)
x2
+ .. -).
(16)
+
a(a
+
I)(a
+
3! e(e
+ 1)(b +
+ 1)(e + 2)
2)b(b
2)
T3
+ ....
This series is called the hypergeometric series. Its sum
By choosing specific values of 0, b. e we can obtain
an incredibly large number of special functions as
solutions of (15) [see the small sample of elementary
functions in part (c)]. This accounts for the importance
of (15).
(a) Hypergeometric series and function. Show that
the indicial equation of (15) has the roots /"1 = 0 and
/"2 = I-c. Show that for /"1 = 0 the Frobenius method
gives (16). Motivate the name for (16) by showing that
1
F(1, I. I; x) = HI. b, b; x) = F(a. 1. a; x) = - -
I - x
(b) Convergence. For what 0 or b will (16) reduce to
a polynomial? Show that for any other a, b. e
(e =fo 0, - I, - 2, ... ) the series (16) converges when
Ixl < 1.
(I
-I-
(I - x)n = I - 1IxF(1 -
11,
1.2: x).
arctan x = x F(!, 1.~; -x2 ).
arcsin x = x F(!,
!, ~: x 2 ).
I, b - e
+
1,2 - e;x).
(e) On the generality of the hypergeometric
equation. Show that
(18)
(12
+
At
+ B)y +
(Ct
+ D»)' + K)'
= 0
with .,. = dyldt. etc.. constant A, B. C. D. K. and
+ At + B = (t - tl)(t - t 2), tl =fo t 2, can be reduced
to the hypergeometric equation with independent
variable
t2
x=
and parameters related by Ct] + D = -e(t2 - tl),
e = a + b + I, K = abo From this you see that (15)
is a "normalized fonn" of the more general (18) and
that various cases of (18) can thus be solved in terms
of hypergeometric functions.
119-241
HYPERGEOMETRIC EQUATIONS
Find a general solution in terms of hypergeometric
functions.
19. x(\ - x»)'''
x)n = F( -11, b. b; -x),
+
)'2(X) = TI-CF(a - e
.\'I(X) is called the hypergeometric function and is
denoted by F(a, b, e; x). Here. e =fo 0, -I, -2.....
(c) Special cases. Show that
Show that
+
(! -
2x)), , -
+
+ h' +
!y
= 0
20. 2x(l - x»," - (1
6x)y' - 2y = 0
21. x(1 - x),,"
2.\' = 0
+
t)y
23. 2(£2 -
+ t)' - )' = 0
5t + 6)5; + (2t - 3).}·
24. 4(t 2
3t
22. 3[( I
-
+ 2)5: - 2." + Y
8y = 0
o
5 CARL FRIEDRICH GAUSS (1777-1855 J. great German mathemmician. He already made the first of his great
discoveries as a student at Helmstedt and Gottingen. In 1807 he became a professor and director of the Observatory
at Giittingen. His work was of basic importance in algebra. number theory, differential equations. differential
geometry. non-Euclidean geometry. complex analysis. numeric analysis. a~trollomy. geodesy. electromagnetism.
and theoretical mechanics. He also paved the way for a general and systematic use of complex numbers.
SEC. 5.5
Bessel's Equation. Bessel Functions},,(x)
5.5
Bessel's Equation. Bessel Functions Jv (x)
189
One of the most important ODEs in applied mathematics in Bessel's equation,6
(1)
Its diverse applications range from electric fields to heat conduction and vibrations (see
Sec. 12.9). It often appears when a problem shows cylindrical symmetry (just as Legendre's
equation may appear in cases of spherical symmetry). The parameter v in (1) is a given
number. We assume that v is real and nonnegative.
Bessel's equation can be solved by the Frobenius method, as we mentioned at the
beginning of the preceding section, where the equation is written in standard form
(obtained by dividing 0) by x 2 ). Accordingly, we substitute the series
(2)
y(x)
~ amxm + r
=
(ao
-=/=-
0)
7n~O
with undetermined coefficients and its derivatives into (1). This gives
:x:
00
We equate the sum of the coefficients of x s + r to zero. Note that this power X S + T
corresponds to 111 = s in the first, second, and fourth series. and to 111 = S - 2 in the
third series. Hence for s = 0 and s = I, the third series does not contribute since
111 ~ O. For s = 2, 3, ... all four series contribute, so that we get a general formula for
all these s. We find
(3)
(a)
r(r - l)ao + rao -
(b)
+
(r
l)ral
+
+
(r
= 0
V2ao
1)(/1 -
2
V (/1
= 0
(s = 0)
(8
=
1)
(8 = 2,3, ... ).
From (3a) we obtain the indicial equation by dropping ao,
(4)
The roots are r I
(r
v(~
0) and
1"2
+
v)(r -
v)
= O.
= -v.
6FRIEDRICH WILHELM BESSEL (1784-1846l. German astronomer and mathematician. studied astronomy
on his own in his spare time as an apprentice of a trade company and finally became director of the new Konigsberg
Observatory.
Formulas on Bessel functions are contained in Ref. [GRI] and the standard treatise [AB].
190
CHAP. 5
Series Solutions of ODEs. Special Functions
Coefficient Recursion for r = rl = v. For r = v, Eq. (3b) reduces to (2v + I)al = 0.
Hence al = since v ~ 0. Substituring r = v in (3c) and combining the three terms
containing as gives simply
°
(5)
(s
+
2v)sas
+
lIs-2
= 0.
°
Since al =
and v ~ 0, it follows from (5) that a3 = 0, a5 = 0, .... Hence we have
to deal only with even-numbered coefficients as with s = 2m. For s = 2m. Eq. (5) becomes
+
(2m
+
2v)2ma2rn
lI2m-2
= O.
Solving for a2m gives the recursion formula
(6)
- - ; ; : 2 - - - - a2m-2,
2 m(v
From (6) we can now determine
a2' £14' • • •
+
m
= I, 2, ....
III
=
m)
successively. This gives
and so on, and in general
(7)
II
2m
=
( -l)'mao
--;;:-------~------
2211117l! (V
+
l)(V
+
2) ... (v
+
Ill)
1, 2, ....
Bessel Functions In(x) For Integer v = n
Illteger vailles oIv are denoted by 11. This is standard. For v =
(8)
=
(/2
111
II
the relation 0) becomes
(-l)'mao
--;;:--------------
22 'mm!
(11
+
l)(n
+
2) . . . (n
+
m)
m
= 1,2,···.
ao is still arbitrary, so that the series (2) with these coefficients would contain this arbitrary
factor ao. This would be a highly impractical situation for developing formulas or
computing values of this new function. Accordingly, we have to make a choice. ao = 1
would be possible, but more practical turns out to be
(9)
because then n!(11
(10)
lIo
+ 1) ...
(n
=
2"n! .
+ m) = (Ill +
a2m
=
n)!
in (8), so that (8) simply becomes
(-l)m
""""="2----'-----2 1n+n m! (n
1Il)!
+
m = 1,2,···.
SEC. 5.5
Bessel's Equation. Bessel Functions },,(x)
191
This simplicity of the denominator of (10) partially motivates the choice (9). With these
coefficients and rl = v = II we get from (2) a particular solution of (I), denoted by in(x)
and given by
(11)
i,,(x) is called the Bessel function of the first kind of order 11. The series (II) converges
for all x. as the ratio test shows. In fact. it converges very rapidly because of the factorials
in the denominator.
E X AMP L E 1
Bessel Functions lo(x) and l,(x)
For
11
=
0 we obtain from (I I) the Bessel function of order 0
(12)
./o(x) =
L-
x6
26(3!)2
(_I)'nJ'x 2m
2211t(m!)2
+ - ."
1l1,-O
which looks similar to a cosine (Fig. 107). For
hex)
(131
L-
=
1n=O
(-I ),,'x2111 + 1
2m
1 + 11ll! (Ill + l)!
II
= I
we obtain the Bessel function of order I
X
2
:3
2:3 1!2!
+
x5
5
2 1!3!
x7
7
2 3!4!
+
-
...
which looks similar to a sine (Fig. 107). But the zeros of these functions are not completely regularly spaced
2 2
(see also Table Al in App. 5) and the height of the "waves" decreases with increasing x. Heuristically. n /x
in (I) in standard form [( I) divided by .\'21 is zero (if It = 0) or small in absolute \alue for large x. and so is
\·'/x. so that then Besser s equation comes close to / ' + Y = O. the equation of cos y and sin y; also / Ix acts
as a "damping term."' in part responsible for the decrea~e in height. One can show that for large x.
(141
./,,(x) -
[2 cos (x
~ -:;;:;:-
-
2itT. - 4T.)
where - is read "asymptotically equal"' and means thatfor.fhed II the quotient of the two
aSX---i>
~ide~ appruache~
I
x.
Formula (14) is surprisingly accurate even for smaller x (> 0). For instance. it will give you good starting
values in a computer program for the basic task of computing Leros. For example. for the first three zeros of ./0
you obtain the values 2.356 (2.405 exact to 3 decimals. error 0.049). 5.498 (5.520. error 0.022), 8.639 (8.654.
error 0.015), etc.
•
0.5
/
""
/
/
/
""
/
O~---L----~~-L----~--~~L-~--,,~/~--~--~~--~lO~--~--~/~/~--x
.....
Fig. 107.
_--/
,,/
Bessel functions of the first kind Jo and JI
192
CHAP. 5 Series Solutions of ODEs. Special Functions
Bessel Functions Jv{x) for any
JJ ::>
o. Gamma Function
We now extend our discussion from integer v = 11 to any v ~ O. All we need is an
extension of the factorials in (9) and (11) to any v. This is done by the gamma function
[( v) defined by the integral
(15)
(v> 0).
By integration by parts we obtain
The first expression on the right is zero. The integral on the right is f( v). This yields the
basic functional relation
(16)
rev
+
l)
= v rev).
Now by (I5)
From this and (16) we obtain successively r(2) = f(J)
and in general
nil +
(17)
1)
= I!, [(3) = 2f(2)
= n!
(n
=
2!, ...
= O. L .. ').
This shows the the gamma function does in fact generalize the factorial function.
Now in (9) we had ao = 1I(2nn!). This is 1!(271
+ 1)) by (17). It suggests to choose,
for any v,
rell
(18)
Then (7) becomes
22mm! (v
+
l)(v
+ 2)
... (v
+ m)2'T(v +
1) .
But (16) gives in the denominator
(v
+
l)[(v
+
IJ
= rev + 2),
(v
+ 2)nv + 2) =
rev
and so on, so that
(v
+
1)(11
+ 2)
... (v
+ 171) rev +
I)
= rev + 11l +
1).
+
3)
SEC. 5.5
Bessel's Equation. Bessel Functions JAx)
193
Hence because of our (standard!) choice (18) of ao the coefficients (7) simply are
(-1)=
(19)
a2nz
=
2
2 m+"m!
With these coefficients and r
by i,.(x) and given by
= r) =
(20)
=
rev + m +
1)
v we get from (2) a particular solution of (1), denoted
00
i,.(x)
x"L
m=O
22m +"I1l!
rev + m + 1)
i,,(x) is called the Bessel function of the first kind of order v. The series (20) converges
for all x, as one can verify by the ratio test.
General Solution for Noninteger v. Solution )-'/
For a general solution, in addition to I,. we need a ~econd linearly independent solution.
For v not an integer this is easy. Replacing v by - v in (20), we have
(21)
Since Bessel's equation involves v 2 , the functions i" and i_,. are solutions of the
equation for the same v. If v is not an integer, they are linearly independent, because
the first term in (20) and the first term in (21) are finite nonzero multiples of x" and
x-", respectively. x = 0 must be excluded in (21) because of the factor x-v (with v> 0).
This gives
THEOREM 1
General Solution of Bessel's Equation
If v is not an integer. a general solution of Bessel's equation for all x
-=I=-
0 is
(22)
But if v is an integer, then (22) is not a general solution because of linear dependence:
THEOREM 2
Linear Dependence of Bessel Functionsln andl_ n
For integer v = n the Bessel functions in(x) and i_n(x) are linearly dependent,
because
(23)
(n
=
1,2, .. ').
CHAP. 5 Series Solutions of ODEs. Special Functions
194
PROOF
We use (21) and let v approach a positive integer n. Then the gamma functions in the
coefficients of the first n terms become infinite (see Fig. 552 in App. A3.1). the
coefficients become zero. and the summation starts with rn = II. Since in this case
rem - n + 1) = (m - Il)! by (17). we obtain
(m
= 11 + s).
The last series represents (-I)nIn{x), as you can see from (11) with m replaced by s. This
completes the proof.
•
A general solution for integer
interesting ideas.
11
will be given in the next section, based on some further
Discovery of Properties From Series
Bessel functions are a model case for showing how to discover properties and relations of
functions from series by which they are defined. Bessel functions satisfy an incredibly large
number of relationships-look at Ref. [AI3] in App. I; also, find out what your CAS
knows. In Theorem 3 we shall discuss four formulas that are backbones in applications.
THEOREM 3
Derivatives, Recursions
The derivative of l,,(x) with respect to x can be expressed by lv_lex) or Iv+I(X) by
the fOl1llu/lls
(a)
[xVI,,(x)]'
(b)
[x-vI,,(x)]'
(24)
= xVJ,,_I(X)
= -X-vI,,+I(X).
Furthermore. J,,(x) and its derivative satisfy the recurrence relations
lv+l(x)
2v
= -lv(x)
lv_leX)
(d)
lv_leX) - Iv+I(X) = 2l~(x).
(24)
PROOF
+
(c)
x
(a) We multiply (20) by xl' and take X2v under the summation sign. Then we have
We now differentiate this, cancel a factor 2, pull X 2v- 1 out, and use the functional
relationship n v + III + 1) = (v + l7l)re v + m ) [see (16)]. Then (20) with v-I instead
of v shows that we obtain the right side of (24a). Indeed,
SEC. 5.5
195
Bessel's Equation. Bessel Functions J Jx)
(b) Similarly, we multiply (20) by x-", so that x" in (20) cancels. Then we differentiate,
cancel 2111, and use Ill! = m(m - I)!. This gives, with III = s + I,
Equation (20) with v + I instead of v and s instead of m shows that the expression on
the right is -x- vJ v + 1 (x). This proves (24b).
(e), (d) We perform the differentiation in (24a). Then we do the same in (24b) and
multiply the result on both sides by x2v. This gives
(a*)
(b*)
vx,.-lJ ..
-vX,·-IJ v
+
x''''~
= x"J,,_1
+ xVJ:, =
-x"J v + 1'
Substracting (b*) from (a*) and dividing the result by x" gives (24c). Adding (a*) and
(b~') and dividing the result by xl' gives (24d).
•
E X AMP L E 2
Application of Theorem 3 in Evaluation and Integration
Formula (24c) can be used recursively in the form
for calculating Bessel functions of higher order from those of lower order. For instance, J 2 (x) = 2J1(.1')/.1' - JoC>:),
so that J 2 can be obtained from tables of J o and h (in App. 5 or, more accurately, in Ref. [GRI] in App. 1).
To illustrate how Theorem 3 helps in integration. we use (24b) with v = 3 integrated on both sides. This
evaluates. for instance. the integral
A table of J 3 (on p. 398 of Ref. [GR I]) or yom' CAS will give you
- A· 0.128943 + 0.019563 = 0.003445.
Yom CAS (or a hnman computer in precomputer times) obtains h from (24), first u~ing (24e) with v = 2,
that is, J 3 = 4.1'- 1J 2 - J 1 • then (24c) with v = I. thal is. J 2 = 2r:- 1h - J o. Together.
1=
x- 3 (4r:- 1 (2r- 1h - J o) - hI
I:
=
-A[2h(2) - 2Jo(2)
=
-AJ1 (2) + !Jo(2) + 7h(l) - 4JoO).
-
h(2)]
+ [8h(l) - 4Jo(l)
-
hill]
This is what you get, for instance. with Maple if you type int(·· '). And if you type evalf(int(··
0.00344544K in agreement with the result near the beginning of the example.
.», you obtain
•
In the theory of special functions it often happens that for certain values of a parameter
a higher function becomes elementary. We have seen this in the last problem set, and we
now show this for J r,.
CHAP. 5
196
THEOREM 4
Series Solutions of ODEs. Special Functions
Elementary}" for Half-Integer Order v
Besselfunctions J" of orders ±!, ±~, ±~, ... are elementary; the}' call be expressed
by fillitely many cosines and sines and powers of x. In particular,
(25)
PROOF
When
lJ
11/2(X)
(a)
=
=
J
2 sin x.
(b)
Ll/2(x)
7T"X
=
J
2 cos X.
7T"X
!, then (20) is
To simplify the denominator. we first write it out as a product AB. where
A
= 2mm! =
2m(2111 - 2)C2m - 4) ... 4·2
and [use (16)J
= (2m +
1)(2111 - I) ... 3· 1 • v.r;;
here we used
(26)
We see that the product of the two right sides of A and B is simply (2111
J 1/2 becomes
1 1 / 2 (x)
=
[f
-
x
~
7T"X m=O
(_I}lnx 2m+1
(2m
+
=
I)!
as claimed. Differentiation and the use of (24a) with
+
I )!v.r;, so that
[f.
-
sin x,
7T"X
lJ
=
! now gives
This proves (25b). From (25) follow further formulas successively by (24c), used as in
Example 2. This completes the proof.
•
E X AMP L E 3
Further Elementary Bessel Functions
From (24c) with v = ~ and v = -~ and (25) we obtain
L 3/ 2(X)
respectively, and so on.
= - -I
x
L 1/ 2(X) - h/2(X)
=-
[f
TTX
(cosx
--
x
+
sinx)
•
SEC 5.5
Bessel's Equation. Bessel Functions JJx)
197
We hope that our study has not only helped you to become acquainted with Bessel
functions but ha<; also convinced you that series can he quite useful in obtaining various
properties of the corresponding functions.
PROBL~ME5EE~E3~.35L-
_____
1. (Convergence) Show that the series in (I I) converges
121-281
for all x. Why is the convergence very rapid?
2. (Approximation) Show that for small Ixl we have
10 = I - 0.25x2 . From this compute 10(.'r) for
x = O. 0.1. 0.2..... 1.0 and determine the error by
using Table Al in App. 5 or your CAS.
Use the powerful formulas (24) to do Probs. 21-28. (Show
the details of your work.)
21. (Derivatives) Show that l~(x) = - 1 1 (x),
l~(x) = 10(x) - 11(X)/X, l~(x)
3. ("'Large" values) Using 04), compute 10lx) for
x = 1.0, 2.0. 3.0..... 8.0, determine the error by
Table Al or your CAS. and comment.
4. (Zeros) Compute the fIrst four positive zeros of 10 (.r)
and 1 1 (x) from (14). Determine the error and comment.
15-201
ODEs REDUCIBLE TO BESSEL'S
EQUATION
6. x 2 / ' + xy' + (x 2 - -W)Y
7. x 2)''' + xy' + !(x - JJ2)y
=
=
=
0
19. x 2 )""
20.
=
x "1" (x)
+
C,
0
CVx = z)
=
0
26. (Integration) Evaluate
integrate by parts.)
I
x- 11 4 (x) dx. (Use Prob. 25;
27. (Integration) Show that
Ix 2 10 (x) dx = x21 1 (x)
+ x10(x) -
Il o(X) dx. (The
last integral is nonelementary; tables exist, e.g. in Ref.
[A13J in App. L)
28. (Integration) Evaluate
I1
5 (X) dx.
29. (Elimination of first derivative) Show that y = ltV
with vex) = exp (-~ J p(x) dx) gives from the ODE
y" + p(x)y' + q(x)y = 0 the ODE
xl/4 u. X1l4 = ::.;)
17. 36x2,," + 18x\"
(y =. x 1l4 u, ix i/4
18. x 2 )'''
IX"l"-l(X) dx
lAx = ;::)
+ 1 )2y" + 2(2x + 1)/ + 16x( t + l)y
(2x + I = ::.;)
9. x/' - / + 4ry = 0 (\. = XII, 2t = z)
10. x 2 )"" + x.v' + !(x 2 - I)), = 0 lx = 2::.;)
11. xy" + (2v + l)y' + xy = 0 (y = x-Vu)
12. x 2)"" + xy' + 4(x 4 - JJ2)y = 0 (x 2 = z)
13 .... 2y" + xy' + 9(x 6 - v 2 »' = 0 (x 3 = z)
I-I. y" + (e 2x - ~)y = 0 (eX = .::.)
15. xy" + y = 0 l)' = Vx u. 2Vx = z)
16. 16x2 )''' + 8xy' + (x1/ 2 + :a)y = 0
=
13(x)l.
0
8. (2x
(y
= Ml 1(x) -
22. (Interlacing ofzeros) Using (24) and Rolle's theorem,
show that between two consecutive zeros of 10(x) there
is precisely one zero of 11 (x).
23. (Interlacing ofzeros) Using (24) and Rolle's theorem.
show that between any two consecutive positive zeros
of In(x) there i~ precisely one J:ero of l,,+I(X).
24. (Bessel's equation) Derive (I) from (24).
25. (Basic integral formulas) Show that
Using the indicated substitution~. find a general solution in
temlS of 1 v and 1 - v or indicate when this is not possible.
(This is just a sample of various ODEs reducible to Bessel's
equation. Some more follow in the next problem set. Show
the details of your work.)
5. (ODE "ith two parameters)
x 2)"" + xy' + (A 2x 2 - JJ2)y
APPLICATION OF (24): DERIVATIVES,
INTEGRALS
+
Vx V
=
z) .
=
0
+ xy' + Vx y = 0 (4t 1l4 = z)
+ !xy' + Vx y = 0 ly = x 2/5 u. 4X1l4 =
+ (I - 2v)xy' + v 2(x 2v + 1 - v 2 )y =
x 2/ '
(y = t"u,
XV
= ::.;)
.::.)
0
no longer containing the first derivative. Show that for
the Bessel equation the substitution is y = UX- 1I2 and
gives
(27)
198
CHAP. 5
Series Solutions of ODEs. Special Functions
30. (Elementary Bessel functions) Derive (25) in
Theorem 4 from (27).
31. CAS EXPERIMENT. Change of Coefficient. Find
and graph (on common axes) the solutions of
.v" + h- 1y' + Y =
O• .1'(0) = I.
y' (0)
=
(c) Conclude that possible frequencies wl27f are those
for which s = 2wv Llg is a zero of 10 , The
con-esponding solutions are called normal modes.
Figure 108 shows the first of them. What does the second
normal mode look like? The third? What is the frequency
(cycles/min) of a cable of length 2 m? Of length 10 m?
O.
for k = 0, 1, 2... '. 10 (or as far as you get useful
graphs). For what k do you get elementary functions?
Why? Try for noninteger k. particularly between 0 and
2. to see the continuous change of the curve. Describe
the change of the location of the zeros and of the
extrema as k increases from O. Can you interpret the
ODE as a model in mechanics, thereby explaining your
observations?
32. TEAM PROJECT. Modeling a Vibrating Cable
(Fig. 108). A flexible cable, chain, or rope of length L
and density (mass per unit length) p is fixed at the upper
end (x = 0) and allowed to make small vibrations
(small angles a in the horizontal displacement u(x. t),
t = time) in a vertical plane.
(a) Show the following. The weight of the cable below
a point x is W(x) = pg(L - x). The restoring force is
F(x) = W sin a = Wu.". U x = (JuliJx. The difference in
force between x and x + D.X is D.x (Wu~')x' Newton's
second law now gives
Equilibrium
position
Fig. 108.
33. CAS EXPERIMENT. Bessel Functions for Large x.
(a) Graph l,,(x) for 11 = 0, ... , 5 on common axes.
(b) Experiment with (14) for integer II. Using graphs.
find out from which x = Xn on the curves of (II) and
(14) practically coincide. How does Xn change with n?
(c) What happens in (b) if II = ::!::~? (Our usual
notation in this case would be v.)
(d) How does the en-or of (14) behave as function
of x for fixed II? [En-or = exact value minus
approximation (14).1
(e) Show from the graphs that 10 (x) has extrema where
1 1 (x) = O. Which formula proves this? Find further
relations between zeros and extrema .
(f) Raise and answer questions of your own. for
instance. on the zeros of 10 and 1 1 , How accurately are
they obtained from (14)?
p .1x II tt = .lx pg[(L - X)lI"Jx.
For the expected periodic motion
= .r(x) cos (wt + 8) the model of the problem
is the ODE
u(x, t)
(L - x).r" -
y' + A\
=
O.
(b) Transform this ODE to~; + ,1'-1" + Y = O.
dylds. s = 2A::,1I2. ::. = L - x. so that the
solution is
." =
y(x) = 10 (2wV(L - >;)Ig).
5.6
Vibrating cable in Team Project 32
Bessel Functions of the Second Kind Yv(x)
From the last section we know that I" and I_v fonn a basis of ~olution<; of Bessel's
equation, provided v is not an integer. But when v is an integer, these two solutions are
linearly dependent on any interval (see Theorem 2 in Sec. 5.5). Hence to have a general
solution also when v = 11 is an integer, we need a second linearly independent solution
besides In. This solution is called a Bessel function of the second kind and is denoted
by Yn . We shall now derive such a solution. beginning with the case II = O.
n = 0:
When
(1)
II
Bessel Function of the Second Kind Yo(x)
= 0, Bessel's equation can be written
xy"
+ y' + .\}' =
O.
SEC 5.6
199
Bessel Functions of the Second Kind Y,.(x)
Then the indicial equation (4) in Sec. 5.5 has a double root r = O. This is Ca<;e 2 in
Sec. 5.4. In this case we first have only one solution. 10(x). From (8) in Sec. 5.4 we see
that the desired second solution must be of the form
00
(2)
=
Y2(X)
+ "L
10(x) In x
Amx7n.
m=1
We substitute .'"2 and its derivatives
I
I
.'"2 = 10 In x
+
""
21~
10 Inx + -x
x
111A",x"'-
1
m=1
10
)'2 =
+ L..
~
10
x2
+ "L
'Xc
111 (m
I )A m x m -
-
2
Jll,=l
into (1). Then the sum of the three logarithmic terms x 1~ In x, 1 ~ In x, and xl0 In x is Lero
because 10 is a solution of (l). The terms - 10 Ix and 10 lx (from xy" and :v') cancel. Hence
we are left with
x
21~
x
+ "L
m(m -
I)Am x m -
1
+ "L
x
mAm·\·m-I
m=l
111=1
"L
+
Amxm+I
= O.
==1
Addition of the first and second series gives Lm2Amxm-l. The power series of 1~(x) is
obtained from 1I2) in Sec. 5.5 and the use of m!/m = (111 - I)! in the form
00
="L
In=l
Together with Lm 2A",xm - 1 and LAmxm+l this gives
(3*)
"L..
m=1
( -1 )nlx2m-l
2 ,-2
2 n Ill!
(m -
I)!
+"
L..
'Xc
111 2A mX"In-I
==1
+"
L..A .,..,..':.=+1 = O.
m=1
First, we show that the Am with odd subscripts are all zero. The power X O occurs only in
the second series. with coefficient AI' Hence Al = O. Next, we consider the even powers
X2s. The first series contains none. In the second serie", m - 1 = 25 gives the term
(25 + 1)2A2s+1X2s. In the third series. m + 1 = 25. Hence by equating the sum of the
coefficients of x 2s to zero we have
5
Since Al = 0, we thus obtain A3 = O. A5 = O.... , successively.
We now equate the sum of the coefficients of X 2s + I to zero. For s
-1
+ 4A2 =
0,
= L. 2 .....
= 0 this gives
thus
For the other values of s we have in the first series in (3*) 2111 - I = 2s + 1, hence
m = s + 1, in the second m - 1 = 2s + 1, and in the third 111 + 1 = 2s + I. We thus obtain
(_l)s+1
2s
1
2 (s + )! s!
+ (2s +
2)2A 2s + 2
+ A2s =
O.
200
CHAP. 5
For s
Series Solutions of ODEs. Special Functions
= J tills yields
3
128
thus
and in general
(3)
A 2m
=
(-I )m-l (
22m(m!)2
1
2
+
= ]+
2
1
+
1
3
J )
+ ... + -
III
.
m = 1,2,···.
Using the short notations
(4)
and inserting
hm
(4)
and Al
= A3 = . . . =
0
+ ... +
into
(2).
m
m
= 2.3.···
we obtain the result
(5)
= Jo(x) lnx +
1 2
4x
_3_
128
X4
+ _1_1_ x 6
13824
+
Since 10 and )"2 are linearly independent functions, they fonn a basis of (I) for x > o.
Of course, another basis is obtained if we replace )"2 by an independent particular solution
of the form a(Y2 + b10), where a (of:. 0) and b are constants. It is customary to choose
a = 2/7T and b = y - ]n 2. where the number y = 0.577 215 664 90 ... is the so-called
Euler constant, which is defined as the limit of
1
+
2
+ ... +
s
-
In s
as s approaches infinity. The standard particular solution thus obtained is called the Bessel
function of the second kind of order ~ero (Fig. 109) or Neumann's function of order
zero and is denoted by Yo(x). Thus [see (4)]
(6)
For small x > 0 the function Yo(x) behaves about like In x (see Fig. 109, why?), and
~ - 'XJ as x ~ o.
Yo(x)
Bessel Functions of the Second Kind Yn(x)
For v = 11 = 1, 2, ... a second solution can be obtained by manipulations similar to those
for 11 = 0, st<uting from (10), Sec 5.4. It turns out that in these cases the solution also
contains a logarithmic term.
The situation is not yet completely satisfactory, because the second solution is defined
differently, depending on whether the order v is an integer or not. To provide uniformity
SEC. 5.6
Bessel Functions of the Second Kind Y•. (x)
201
of fonnalism, it is desirable to adopt a form of the second solution that is valid for all
values of the order. For this reason we introduce a standard second solution Y,,(x) defined
for all v by the formula
(a)
Y,.(X)
=
Sill
(7)
V7T
[1,.(x) cos V7T - i_.,(x)]
Yn(X) = lim VAx).
(b)
,,~n
This function is called the Bessel function of the second kind of order vor Neumann's
function 7 of order v. Figure 109 shows Yo(x) and YI(X),
Let LIS show that i,. and YI' are indeed linearly independent for all v (and x > 0).
For non integer order v. the function Y,lx) is evidently a solution of Bessel's equation
because i,.(x) and 1 -,,(x) are solutions of that equation. Since for those v the solutions 1"
and i_,. are linearly independent and Y,. involves I_v, the functions i,. and Y" are linearly
independent. Furthermore, it can be shown that the limit in (7b) exists ,md Yn is a solution
of Bessel's equation for integer order; see Ref. [A13] in App. 1. We shall see that the
series development of Yn(x) contains a logarithmic term. Hence i,lr) and Yn(x) are linearly
independent solutions of Bessel's equation. The series development of Yn(x) can be
obtained if we inseI1 the series (20) and (21), Sec. 5.5. for i,.(x) and L,.(x) into (7a) and
then let v approach 11; for details see Ref. [A13]. The result is
Y,,(x)
(x
2
)
xn ~
= - 1n(x) In -2 + 'Y + 7T
(8)
_x
-n n - l
'"
~
7T
where x > 0,
h
m
11
m~O
(n - m -
1
2
Fig. 109.
111
(-I)m-l(h m
2
2
I)!
m+1l-17l !
-----;::----- X
2m n
2 - m!
= O. 1. .... and [as in (4)]
=1+-+ .. ·+
~
7T m~O
ho
hm+n
+ h m+ n )
(Ill
+ n)!
2
1ll
= O. hI = L
= 1+
1
2.
+ ... +
111
+ 11
Bessel functions of the second kind Yo and Y,.
(For a small table, see App. 5.)
7CARL NEUMANN (1832-\9251. German mathematician and physicist. His work on potential theory sparked
the development in the field of integral equations by VITO VOLTERRA (1860-1940) of Rome. ERIC IVAR
FREDHOLM (I 866--19D) of Stod.holm. and DAVID HILBERT (1862-1943) of Giittingen (see the footnote
in Sec. 7.91.
The solutions Y,.(X) are sometimes denoted by N,.(x); in Ref. [A13] they are called Weber's functions; Euler',
constant in (6) is often denoted by C or In 1'.
101
CHAP. 5
Series Solutions of ODEs. Special Functions
For n = 0 the last sum in (8) is to be replaced by 0 [giving agreement with (6)].
Furthermore, it can be shown that
Our main result may now be formulated as follows.
THEOREM 1
General Solution of Bessel's Equation
A general solution of Bessel's equatimz for all values of v (and x > 0) is
(9)
We finally mention that there is a practical need for solutions of Bessel's equation that
are complex for real values of x. For this purpose the solutions
(10)
H~~\>;;)
= J v(X) +
iY,,(x)
H~~)(x)
=
iY,,(x)
Jv(x) -
are frequently used. These linearly independent functions are called Bessel functions of
the third kind of order v or first alld second Hankel functions B of order v.
This finishes our discussion on Bessel functions, except for their "orthogonality," which
we explain in Sec. 5.7. Applications to vibrations follow in Sec. 12.9.
11-10 1
SOME FURTHER ODEs REDUCIBLE TO
BESSEL'S EQUATIONS
(See also Sec. 5.5.)
Using the indicated substinl!ions, find a general solution in
terms of J v and Y •. Indicate whether you could also use J- v
instead of Y v ' (Show the details of your work.)
1. x2y" + x)" + (x 2 - 25)y = 0
2
2
2. x -,," + x/ + (9x - ~)y = 0 (3x = .:::)
3. 4xy" + 4/ + y = 0 (~=.:::)
4. xy" + y' + 36)" = 0 (\ 2~ = z)
5. x 2 y" + xy' + (4 X 4 - 16)y = 0 (x 2 = z)
6. x 2 -,," + x/ + (x 6 - I)." = 0 (~x3 = z)
7. xy" + 11/ + xy = 0 (y = x- 5 ,,)
8. y" + -1-x 2 y = 0 (y = u~. x 2 = z)
9. x 2 y" - 5xy' + 9(x 6 - 8)y = 0 (y = x 3 u, x 3 = z)
10. xy" + 7/ + 41·Y = 0 (y = x- 3 u. 2x = z)
11. (Hankel functions) Show that the Hankel functions ( 10)
form a basis of solutions of Bessel's equation for any v.
12. CAS EXPERIMENT. Bessel Functions for Large x.
It can be shown that for large x.
(11)
Yn(x) -
v'2/( 7TX) sin (x -
! 1l7T -
~7T)
with - defined as in (14) of Sec. 5.5.
(a) Graph Yn(x) for 11 = O.... , 5 on common axes.
Are there relations between zeros of one function and
extrema of another? For what functions?
(b) Find out from graphs from which x = Xn on
the curves of (8) and (11) (both obtained from your
CAS) practically coincide. How does Xn change
with 11?
(c) Calculate the first ten Leros X m ' In = I, ... , 10,
of Yo(x) from your CAS and from (II). How does the
error behave as 171 increases?
(d) Do (c) for Yl(X) and Y2(x). How do the errors
compare to those in (c)?
BHERMANN HANKEL (1839-1873). German mathematician.
SEC. 5.7
Sturm-Liouville Problems. Orthogonal Functions
203
13. Modified Bessel functions of the first kind of order P
are defined by Iv(x) = i-vJ,,(ix), i = \1"="1. Show that
Iv satisfies the ODE
14. (Modified Bessel functions I.,) Show that I,,(x) is real
for all real x (and real v), 1v (x) "* 0 for all real x "* 0,
and Ln(x) = In(x). where n is any integer.
x 2 y" + xy' - (x 2 + v 2 )y = 0
15. Modified Bessel functions of the third kind (sometimes
called of the second kind) are defined by the formula (14)
below. Show that they satisfy the ODE (12).
(12)
and has the representation
cc
(13)
[,Jx) =
x2m.+v
L
22m+vm!
rem + v +
(14)
I)
KJI:) =
711=-0
5.7
7r
[Lv(.\) - IJr)]
.
2 sm
V7r
Sturm-Liouville Problems.
Orthogonal Functions
So far we have considered initial value problems. We recall from Sec. 2.1 that such a problem
consists of an ODE, say, of second order, and initial conditions .1'(xo) = Ko, y' (xo) = KI
referring to the same point (initial point) x = Xo. We now turn to boundary value problems.
A boundary value problem consists of an ODE and given boundary conditions referring
to the two boundary points (endpoints) x = a and x = b of a given interval a ~ x ~ b.
To solve such a problem means to find a solution of the ODE on the interval a ~ x ~ b
satisfying the boundary conditions.
We shall see that Legendre's, Bessel's, and other ODEs of importance in engineering
can be written as a Sturm-Liouville equation
(1)
[p(x)y']'
+
[q(x)
+
Ar(x)]y = 0
involving a parameter A. The boundary value problem consisting of an ODE (1) and given
Sturm-Liouville boundary conditions
+
(a)
kIy(a)
(b)
IIy(b)
(2)
k2 y' (a) = 0
+ 12y'(b) =
0
is called a Sturm-Liouville problem. 9 We shall see further that these problems lead to
useful series developments in terms of particular solutions of (1), (2). Crucial in this
connection is orthogonality to be discussed later in this section.
In (1) we make the assumptions thatp, q, r, andp' are continuous on a ~ x ~ b, and
rex)
>0
(a
~
x
~
b).
In (2) we assume that kI , k2 are given constants, not both zero, and so are II, 12, not both
zero.
9JACQUES CHARLES FRAN<;:OIS STURM (1803-1855), was born and studied in Switzerland and then
moved to Paris, where he later became the successor of Poisson in the chair of mechanics at the Sorbonne (the
University of Paris).
JOSEPH LIOUVILLE (1809-1882), French mathematician and professor in Paris, contributed to various
fields in mathematics and is particularly known by his important work in complex analysis (Liouville's theorem;
Sec. 14.4), special functions, differential geometry, and number theory.
204
E X AMP L E 1
CHAP. 5 Series Solutions of ODEs. Special Functions
Legendre's and Bessel's Equations are Sturm-Liouville Equations
Legendre's equation
(I -
x2)y" -
2.\)"'
+ "(,, +
I)y = 0 may be written
A=
11(11
+
I).
This is (1) With P = I - x 2 , q = O. and r = 1.
In Bessel's equation
.,. =
dyldx.
etc.
as a model in physics or elsewhere. one often likes to have another parameter k in addition to II. For this reason
we set x = h. Then by the chain rule .,. = dyld.y = «(~,"ldx) drldx = y'lk. ;.' = y"lk 2 . In the first two lerms. k 2
and k drop out and we get
Division by x gives the Sturm-Liouville equation
[xy'l' +
This is (I) with p = x. q =
_/12/x,
+
A.\}'
0
=
•
and r = x.
Eigenfunctions, Eigenvalues
Clearly, y == 0 is a solution-the "trivial solution"-for any A because (I) is homogeneous
and (2) has zeros on the right. This is of no interest. We want to find eigenfunctions y(x),
that is, solutions of (l) satisfying (2) without being identically zero. We call a number A
for which an eigenfunction exists an eigenvalue of the Sturm-Liouville problem (1), (2).
E X AMP L E 2
Trigonometric Functions as Eigenfunctions. Vibrating String
Find the eigenvalues and eigenfunctions of the Sturm-Liouville problem
y" + Ay = O.
(3)
.1'(0)
= 0,
.1'( 77)
= O.
This problem arises. for instance. if an elastic SIring (a violin sIring, for example) is slrelched a little and then
fixed at its ends x = 0 and x = 77 and allowed to vibrate. Then y(x) is the "space function'" of the deflection
II(X. 1) of the string. assumed in the fom1 II(X, 1) = y(x)II'(1). where t is time. (This model will be discussed in
great det:Jil in Secs. 12.2-12.4.)
Sollltion.
k2
=
12
From (I) and (2) we see that p = I, if = O. r = I in (I I. and a = O. b = 77. kl = 11 = I.
= 0 in (2). For negative A = - v 2 a general solution of the ODE in (3) is y(x) = ("levx + c2e -VX. From
the boundary conditions we obtain ("1 = ("2 = O. so that y == O. which is not an eigenfunction. For A = 0 the
situation is similar. For positive A = v 2 a general solution is
y(x) = A cos vx
-t-
B sin vx.
From the first boundary condition we obtain \'(0) = A = O. The second boundary condition then yields
)'(77) = B sin V77 = 0,
For v = 0 we have y
==
O. For A = v
2
thus
= 1. 4. 9. 16..... laking B
y(X) = sin vr
Hence the eigenvalues of the problem are A = v
= sin VX, where" = 1. 2. . . . .
y(x)
v = O. :!:I, :+:2,···.
2
,
= I. we obtain
(v =
1,2, .. ').
where v = I, 2, ... , and corresponding eigenfunctions are
•
Existence of Eigenvalues
Eigenvalues of a Sturm-Liouville problem (I), (2), even infinitely many, exist under rather
general conditions on p. q. r in (1). (Sufficient are the conditions in Theorem I, below,
together with p(x) > 0 and r(x) > 0 on a < x < b. Proofs are complicated; see Ref. LA3]
or [All] listed in App. 1.)
SEC. 5.7
205
Sturm-Liouville Problems. Orthogonal Functions
Reality of Eigenvalues
Furthermore, if p, q, r, and p' in (l) are real-valued and continuous on the interval
a ~ x ~ band r is positive throughout that interval (or negative throughout that interval).
then all the eigenvalues of the Sturm-Liouville problem (l), (2) are real. (Proof in
App. 4.) This is what the engineer would expect since eigenvalues are often related to
frequencies, energies. or other physical quantities that must be real.
Orthogonality
The most remarkable and important property of eigenfunctions of Sturm-Liouville problems
is their 0I1hogonality, which will be crucial in series developments in terms of eigenfunctions.
DEFINITION
Orthogonality
Functions Yl(X), Y2(X), ... defined on some interval a ~ x ~ b are called orthogonal
on this interval with respect to the weight function rex) > 0 if for all 111 and all n
different from tn,
I
(4)
b
rex) Ym(X) Yn(X) dx = 0
(m
*-
n).
a
II Ym II
The norm
of Ym is defined by
I
IIYmll
(5)
b
r(x)Ym2(x) dr: .
a
Nore thar this is the "quare roor of the integral in (4) with Il = Ill.
The functions .'"1> )"2, •.• are called orthonormal on a ~ x ~ b if they are
orthogonal on this interval and all have norm I.
If r{x) = I, we more briefly call the functions orthogonal instead of orthogonal
with respect to rex) = I; similarly for orthonormality. Then
I
b
Ym(X) Yn(X) dx =
0
*-
(m
IIYmll =
11),
a
E X AMP L E 3
I
b
2
Ym (x) dx.
a
Orthogonal Functions. Orthonormal Functions
The functions y",(x) = sin mx, m = I. 2.... form an orthogonal set on the interval m 1= 11 we obtain by integration [see (II) in App. A3. II
IT>' J'm(X)Yn(X) d,' = IT>' sin Ill\' sin In dx =
-77'
The norm
1I"
IiYmli
cos (111 - II)T tiT
-1 IT>'
-71
-r,
7T ;§; X ;§; 7T,
cos (/II
because for
+ II)X dx
=
O.
-17
equals V;. because
IIYmli 2 = I:"sin2 /11x dx = 7T
tm = 1,2,"
'J.
Hence the corre'ponding orthonormal set, obtained by division by the norm, is
sinx
v;,. ,
sin2x
v;,. ,
sin 3x
V; ,
•
CHAP. 5
206
Series Solutions of ODEs. Special Functions
Orthogonality of Eigenfunctions
Orthogonality of Eigenfunctions
THEOREM 1
Suppose that the functions p, q, r, and p' in the Snl17ll-Liouville equation (l) are
real-valued alld contilluous alld r(x) > 0 Oil the interval a ~ x ~ b. Let Ym(x) and
Yn(x) be eiRellfilllctiolls of the Sturm-Lioul'iIIe problem (I), (2) that correspond to
different eigelll'alues Am alld An' respectively. Theil .1'm, Yn are orthogollal on that
interval with respect to the weight jilllctioll r. that is.
J
b
(6)
r(x)ym(x)yn(x) dl:
=0
(111 =1= 11).
a
fr pea) = 0, thell (2a) can be dropped from the problem. fr pCb) = 0, thell (2b)
can be dropped. [It is then required that y and .1" remain bounded at such a point,
and the problem is called singular, as opposed to a regular problem in which (2)
is used.]
ljp(a) = pCb), thell (2) call be replaced by the "periodic boundary conditions"
(7)
yea)
= y(b),
)" (a)
= y' (b).
The boundary value problem consisting of the Sturm-Liouville equation (I) and the
periodic boundary conditions (7) is called a periodic Sturm-Liouville problem.
PROOF
By assumption. Ym and
)'n
satisfy the Sturm-Liouville equations
(PY;nJ'
+
(q
+ Amr)Ym = 0
(py;J'
+
(q
+ An r »)'" = 0
respectively. We multiply the first equation by .1'n, the second by -Ym' and add.
where the last equality can be readily verified by performing the indicated differentiation
of the last expression in brackets. This expression is continuous on a ~ x ~ b since p
and p' are continuous by assumption and Y>n' Yn are solutions of (I). Integrating over x
from a to b, we thus obtain
(8)
(a
<
b).
The expression on the right equals the sum of the subsequent Lines I and 2,
(9)
p(b)L,,:,(iJ)Ym(b) - y';"(b)Yn(b)]
(Line I)
-p(a)[y~(a)Ym(a) - Y:n(a)Yn(a)]
(Line 2).
Hence if (9) is zero. (8) with Am - An =1= 0 implies the orthogonality (6). Accordingly,
we have to show that (9) is zero, using the boundary conditions (2) as needed.
SEC. 5.7
207
Sturm-Liouville Problems. Orthogonal Functions
Case 1. pea)
Case 2. pea)
= pCb) = O. Clearly. (9) is zero, and (2) is not needed.
'* 0,
pCb)
= O. Line
I of (9) is zero. Consider Line 2. From (2a) we have
k2y~(a)
k1Yn(a)
+
k1Ym(a)
+ k 2Y;n(a) = O.
0,
=
Let k2 =F O. We multiply the first equation by Ym(a). the last by -y,,(a) and add.
k2[Y;t(a)Ym(a) - y,',.,(a))",,(a)]
= O.
This is k2 times Line 2 of (9), which thus is zero since k2 =F O. If k2
assumption, and the argument of proof is similar.
= O. then kl
=F 0 by
'*
'* O. We use both (2a) and (2b) and proceed as in Cases 2 and 3.
Case 3. pea) = O,p(b)
O. Line 2 of (9) is zero. From (2b) it follows that Line 1 of (9)
is zero; this is similar to Case 2.
Case 4. pea)
'*
Case 5. pea)
= pCb). Then (9) becomes
O,p(b)
The expression in brackets [.. '1 is zero, either by (2) used as before, or more directly by
(7). Hence in this case, (7) can be used instead of (2), as claimed. This completes the
proof of Theorem 1.
•
E X AMP L E 4
Application of Theorem 1. Vibrating Elastic String
The ODE in Example 2 is a Sturm~Liouville equation with p = 1. q = O. and r = I. From Theorem I it follows
•
that the eigenfunctions Yilt = sin m~ (111 = 1.2.... ) are orthogonal on the interval 0 ~ x ~ 7T.
E X AMP L E 5
Application of Theorem 1. Orthogonality of the Legendre Polynomials
Legendre's equation is a
Sturm~Liouville
equation (see Example I)
[ (I-x).'"
2
'l' +A.I'=O.
A = nen
+
I)
with I' = I - x 2 • q ~ 0, and r = I. Since p( -1) = p(l) = 0, we need no boundary conditions. but have a
~illglliar Sturm-Liouville problem on the interval -1 ~ x ~ 1. We know that for 11 = 0, I, ... , hence
A = O. I . 2, 1 . 3....• the Legendre polynomials P n(x) are solution, of the problem. Hence these are the
eigenfunctions. From Theorem I it follows that they are orthogonal on that interval. that is,
f
(10)
1
Pm(x)Pn(x) dx =
0
(Ill 1= n). •
~1
E X AMP L E 6
Application of Theorem 1. Orthogonality of the Bessel Functions In(x)
The Bessel function in(x) with fixed integer II
~
0 satisfies Bessel's equation (Sec. 5.5)
where j" = dJnldx. j~ = d 2 i,/dx2 . In Example 1 we transformed this equation. by setting
equation
x=
h. into a
Stuml~Liouville
with pIx) = ~,q(x) = -1l /x, rex) = x, and parameter A = k 2 . Since 1'(0) = O. Theorem 1 implies orthogonality
on an interval 0 ~ x ~ R (R given. fixed) of those solutions JnVer) that are zero at x = R. that is.
2
(11)
in(kR) = 0
(II fixed).
208
CHAP. 5
Series Solutions of ODEs. Special Functions
[Note that q(x) = -1l 2 /x is discontinuous at O. but this does not affect the proof of Theorem 1.1 It can be shown
(see Ref. [A 13]) that In(Ji) ha~ infinitely many zeros, say, = an.l < 0'n.2 < ... (see Fig. 107 in Sec. 5.5 for
II = 0 and I). Hence we must have
x
(12)
This
THEOREM 2
thus
kR = O'n,711
prove~
(Ill = 1,2."
').
the following orthogonality property.
Orthogonality of Bessel Functions
For each fixed nonnegative integer n the sequence of Bessel functions of the first
ki1ld In(k,,.lX), I n (kn.2 x), ... with ~.m as in (12) forms an orthogonal set on the
imell'al 0 ~ x ~ R with respect to the weight function r(x) = x, that is.
R
f xJn(kn,mx)Jn(kn,jx) dx
(13)
o
=
(j
0
*-
111, II
fixed).
Hence we have obtained illfinitely lIIallY orthogollal sets. each conesponding to one of the fixed values Il. This
also illustrates the importance of the zeros of the Bessel functIons.
•
E X AMP L E 7
Eigenvalues from Graphs
Solve the Sturm-Liouville problem y" + Ay = O.
.1'(0) + y' (0) = O.
y( 17) -
y' (17) =
o.
Solution. A general solution and its derivative are
y=Acosb:+Bsinb:
y'
and
The fust boundary condition gives yeO) +
and substitution of A ~ -B/.. give
= . Ak sin b:
+ Bk cos b:.
k=
VA.
y' (0) = A + Bk = O. hence A = - Bk. The second boundary condilion
)'(17) - y' (17) = A cos 17/... + B sin 17k + Ak sin 17k - Bk cos 17k
= -Bk
"*
We must have B
B cos 17k gives
cos 17k
+B
sin 17k - Bk 2 sin 17k - Bk cos 17k = D.
0 since otherwise B = A = O. hence y = O. which is not an eigenfunction. Division by
-k + tan 17k - k 2 tan 17k - k = 0,
thus
tan 17k =
-2k
k2 - I .
The graph in Fig. I IO now shows u, where to look for eigenvalues. These conespond to the k-values of the points
of intersection of tan 17k and the right side - 2k/(k 2 - I) of the last equation. The eigenvalues are Am = k",2,
where 1..0 = 0 with eigenfunction Yo = I and (he other An, are located near 22 , 32 ,42 , . . . , with eigenfunctions
cos k",x and sin k",x, 111 = 1,2, .... The precise numeric determination of the eigenvalues would require a
root-finding method (such as those given in Sec. 19.2).
•
y
1
1/:
Or-~-+--r-7--+~~+--r~--+---
k
-1
-2
-3
Fig. 110.
Example 7. Circles mark the intersections of tan 1Tk and - 2k/(e - 1)
SEC. 5]
209
Sturm-Liouville Problems. Orthogonal Functions
..........
_..... ---.
_....... ..........
-_ ..............
..-
1. (Proof of Theorem 1) CalTY out the details in Cases
3 and 4.
2. Normalization of eigenfunctions Ym of (I), (2) means
that we multiply Ym by a nonzero constant em such that
emYm has norm I. Show that 2m = ey", with any e
0
is an eigenfunction for the eigenvalue corresponding to
vm 3. (Change of x) Show that if the functions Yo(x), YI(X),
... form an orthogonal set on an interval a ~ x ~ b
(with rex) = I), then the functions )'o(er + k), )'l(er + k),
... , e > 0, form an orthogonal set on the interval
*
~ I ~
(a - k)/e
STURM-LIOUVILLE PROBLEMS
Write the given ODE in the form (I) if it is in a different
form. (Use Prob. 6.) Find the eigenvalues and eigenfunctions.
Verify orthogonality. (Show the details of your work.)
7. y" + Ay = o.
yeO) = 0, )'(5) = 0
8. y" + Ay = 0,
),'(0) = 0, y'(w) = 0
9. y" + Ay = 0,
yeO) = 0, )" (L) = 0
10. y" + Ay = 0,
yeo) = yO), y' (0) = y' 0)
11. y" + Ay = O.
yeO) = y(2w), y' (0) = y' (2rr)
12. .'v"
+
yO)
,
AV
. = O.
+
\,(0)
.,.
+ -v'(O)
=
o.
o. yO)
=
0, y(2)
=
O.
(a) Chebyshev pol)nomiaIs lO of the first and second
kind are defined by
Tn(x) = cos (n arccos x)
Un(x) =
respectively, where
+
(A
(Set x = e t .)
+ I)y = 0,
+
(A
+
16)y = 0,
+ I) arccos x]
~
sin [(n
11
=
O. 1, .. '. Shuw that
To = 1,
Uo
=
1,
Show that the Chebyshev polynomials Tn(x) are
orthogonal on the interval - I ~ x ~ I with respect to
the weight function rex) = 1I~. (Hint. To
evaluate the integral, set arccos x = e.) Verify that
T1l (x), 11 = 0, I, 2. 3, satisfy the Chebyshev equation
(1 -
x 2 )y" - xy'
+ n2 y
= O.
(b) Orthogonality on an infinite interval: Laguerre
polynomials l l are defined by Lo = I, and
0,
13. y" + Ay = 0,
yeO) =0, y(\)+y'(1)=O
14. (xY')' + Ax-Iy = 0,
yO) = 0, y'(e) = O.
(Set x = e t .)
15. (x-1y')' + (A + L)x- 3y = 0,
y(l) = O.
y(e"") =
=
20. TEAM PROJECT. Special Functions. Orthogonal
polynomials playa great role in applications. For this
reason, Legendre polynomials and various other
orthogonal polynomials have been studied extensively;
see Refs. [GR1], [GRIO] in App. 1. Consider some of
the most important ones as follows.
y (I) = 0
16. y" - 2/
yO) = 0
17. y" + 8y'
y( rr) = 0
+ (k 2 + 2x- 2 )y
(Use a CAS or set y = xu.)
(b - k)!c.
4. (Change of x) Using Prob. 3, derive the orthogonality
of I, cos wx, sin wx, cos 2wx. sin 2wx. ... on
-1 ~ x ~ I (r(x) = 1) from that of 1, cos x, sin x,
cos 2x, sin 2x, ... on -w ~ x ~ rr.
5. (Legendre polynomials) Show that the functions
P,,{cos 6), n = 0, I, ... , form an orthogonal set on
the interval 0 ~ e ~ rr with respect to the weight
function sin e.
6. (Tranformation to Sturm-Liom iIIe form) Show that
Y" + fy' + (g + Ah»)' = 0 takes the form (I) if you
set p = exp (If dx), q = pg, r = hp. Why would you
do such a transformation?
17-191
19. y" - 2x- 1 y'
n = 1,2,'"
Show that
yeO) = 0,
yeO)
18. xY" + 2y' + Axy = 0,
yew) = 0,
(Use a CAS or set y = x-1u.)
=
0,
n2rr) = O.
Prove that the Laguerre polynomials are orthogonal on
the positive axis 0 ~ x < :x; with respect to the weight
function /"p:) = e- x . Hint. Since the highest power in
Lm is x"', it suffices to show that f e-xxkLn dx = 0 for
k < n. Do this by k integrations by parts.
IOpAFNUTI CHEBYSHEV (1821-1894), Russian mathematician. is known for his work in approximation
theory and the theory of numbers. Another transliteration of the name is TCHEBICHEF.
llEDMOND LAGUERRE (1834-1886). French mathematician. who did research work in geometry and in
the theory of infinite series.
210
CHAP. 5
Series Solutions of ODEs. Special Functions
5.8 Orthogonal Eigenfunction Expansions
Orthogonal functions (obtained from Sturm-Liouville problems or otherwise) yield
important series developments of given functions. as we shall see. This includes the famous
FOllrier series (to which we devote Chaps. 11 and 12), the daily bread of the physicist and
engineer for solving problems in heat conduction. mechanical and electrical vibrations, etc.
Indeed, orthogonality is one of the most useful ideas ever introduced in applied mathematics.
Standard Notation for Orthogonality and Orthonormality
The integral (4) in Sec. 5.7 defining orthogonality is denoted by (Ym. Yn). This is standard.
Also. Kronecker's deJta 12 omn is defined by omn = 0 if 111 *- II and omn = 1 if 111 = 11
(thus on" = I). Hence for orthonormal functions Yo, ."1' Y2' ... with respect to weight
rex) (> 0) on [I ;:::; x ;:::; b we can now simply write (Ym' Yn) = om11,' written out
if
/11
*-
if
11l
= n.
11
(1)
Also. for the norm we can now write
lIyll = \I(y",. Ym)
(2)
=
f
b
2
r(x)Ym (x) d.r.
a
Write down a few examples of your own, to get used to this practical
~hort
notation.
Orthogonal Series
Now comes the instant that shows why orthogonality is a fundamental concept. Let
Yo, .1'1' .1'2• .•. be an orthogonal set with respect to weight r(x) on an interval [I ;:::; x;:::; b.
Let J(x) be a function that can be represented by a convergent series
x
J(x) = ~ [lmY",(x) = [loYo(x)
(3)
+
[lIYl(X)
+
m~O
This is called an orthogonal expansion or generalized Fourier series. If the Ym. are
eigenfunctions of a Sturm-Liouville problem. we call (3) an eigenfunction expansion. In
(3) we use again 111 for summation since 11 will be used as a fixed order of Bessel functions.
Given J(x). we have to determine the coefficients in (3), called the Fourier constants
of J(x) with re:,pect to Yo, )'1, . . . . Because of the orthogonality this is simple. All we have
to do is to multiply both sides of (3) by r(x)y,,(x) (nfixed) and then integrate on both sides
from a to b. We assume that term-by-term integration is permissible. (This is justified, for
instance, in the case of "uniform convergence," as is shown in Sec. 15.5.) Then we obtain
(J,
)'n)
=
f
a
b
rJ)'n
dr =
f r (x~
b
a
m=O
[I",Ym
)
.1'" dx
=
~
ex:
[lm(Y",., Yn)'
m~O
12LEOPOLD KRONECKER (1823-1891 l. German mathematician at Berlin University. who made important
to algebra. group theory. and number theory.
contribution~
SEC. 5.8
211
Orthogonal Eigenfunction Expansions
Because of the orthogonality all the integrals on the right are zero. except when
Hence the whole infinite series reduces to the single term
111
n.
Assuming that all the functions Yn have nonzero norm, we can divide by II.vn 112; writing
again 111 for n, to be in agreement with (3), we get the desired formula for the Fourier
constants
(f,
(4)
EXAMPLE 1
1
YIll)
Ily",11
2
f
b
(m
r(x)f(x)Ym(x) dx
0, 1, .. ').
a
Fourier Series
A mo,t important c\as, of eigenfunction expansions is obtained from the periodic Sturm-Liouville problem
y" + Ay = 0,
y'('iT)
A general solution of the ODE is y = A cos kx + B sin kx, where k
into the boundary conditions, we obtain
A cos k'iT
-kA sin k7T
+
+
B ,in k7T = A cos (-k'iT)
+
kB cos k'iT = -kA sin (-k'iT)
= y'(-'iT).
= VA. Substituting y and its derivative
B sin (-k7T)
+
kB cos (-k7T).
Since cos l-a) = cos a, the cosine terms cancel, so that these equations give no conditlOn for these terms. Since
sin (-a) = -sin a, thc equations gives the condition sin k7T = 0, hence k'iT = 1II'iT, k = 11/ = 0, 1,2, ... , so
that the eigenfunctions are
cos 0
=
1,
sin x,
cos X,
sin 2x, .. "
cos 2x,
cos
IIIX,
sinlllx, ...
corresponding pairwise to the eigenvalues A = k 2 = 0, 1,4, ... , m 2 , . . . . lsin 0 = 0 is not an eigenfunction.)
By Theorem I in Sec. 5.7, any two of these belonging to different eigenvalues are orthogonal on the interval
-7T ~ X ~ 7T (note that rex) = 1 for the present ODE). The orthogonality of cos I1lX and sin 111X for the same
111 follows by integration.
III I
I,
For the 1l0mlS we get
= \1'2;, and v:;;: for all the others, as you may verify by integrating
cos2 'y,
2
sin x. etc .. from -'iTto 'iT. This gives the series (with a slight extension of notation sincc we have two functions
for each eigenvalue I, 4, 9, ... )
(5)
f(x) = ao
+
L
(am cos IIIX
+ b m sin fI1x).
1#l.=1
According to (4) the coefficients (with
111 =
1,2, ... ) are
b1ll. =
(6)
..!..
7T
f'"
J(x) sin 111X £lx.
-71
The series (5) is called the Fourier series of f(x). Its coefficients are called the Fourier coefficients of f(x),
as given by the so-called Euler formulas l6) lnot to be confused with the Euler formula (11) in Sec. 2.2).
For instance, for the "periodic rectangular wave" in Fig. III, given by
f(x) =
{
-I
1
if
-7T<X<O
if
O<x<'iT
and
f(x
+
27T) = ./(x),
212
CHAP. 5
Series Solutions of ODEs. Special Functions
°
we get from (6) the values ao =
and
[I~,,-(-I)COSIIIXdX+ L"-I'COS11lXdT] =0,
7r
bm =
[I~,,-(-I)SinIllXdT+ Io"-l,sinIllXdT]
7r
4f( mil)
7Tln
ifm
°
[I - 2 cos 1117r + I] = {
= 1.3.···.
if m = 2,4···.
Hence the Fourier senes of the periodic rectangular wave is
f (x)
=
~
7r
~
~
1C
0
1C
x
21C
-~-l ~
Fig. 111.
•
(Sin x + 3 sin 3x + 5 sin 5,
- + ... ) .
Periodic rectangular wave in Example 1
Fourier series are by far the most important eigenfunction expansions. so important to
the engineer that we shall devote two chapters (11 and 12) to them and their applications,
and discuss numerous examples.
Did it surprise you that a series of continuous functions (sine functions) can represent
a discontinuous function? More on this in Chap. 11.
E X AMP L E 2
Fourier-Legendre Series
A Fourier-Legendre series is an eigenfunction expansion
.f(x)
=
L
a'mP'm(x)
=
aoPo
+ a1 P1(x) + (l2 P 2(x) + - - - = ao + a1x + a2(ix2 -
i) -'- -
m=O
in terms of Legendre polynomials (Sec_ 5.3). The latter are the eigenfunctions of the Sturm-Liouville problem
in Example 5 of Sec. 5.7 on the interval -I ~ x ~ I. We have rex) = I for Legendre'S equation, and (4) gives
2111+ I
2
(7)
becau~e
(8)
am = - - -
I
1
f(x)Pm(x) d>:.
111
= 0, I,'"
-1
the norm is
I
1
-1
Pm(x)2
dx =
1_2_
+ 1
V
(Ill = 0, I, ... )
2111
as we state without proof (The proof is tricky; it uses Rodrigues's formula in Problem Set 5 3 and a reduction
of the resulting integral to a quotient of gamma functions_)
SEC. 5.8
213
Orthogonal Eigenfunction Expansions
For instance, let j(x)
=
sin 7TX. Then we obtain the coefficients
1
1
2nz+II (sin 7TX) P'In(x) dx,
a'ln = --2--
3
2
a1 =
thus
-1
J
x sin TTX tIT: =
3 = 0.95493,
-1
etc.
7T
Hence the Fourier-Legendre series of sin TTX is
sin TTX = 0.95493P 1 (x) - 1.15824P3(x)
+ 0.21429P 5 (x)
- 0.OJ664P7 (x)
+ 0.00068P9 (x)
- 0.OO002P l l(x)
+ ....
The coefficient of P 13 is about 3 . 10- 7 . The sum of the first three nonzero terms gives a curve that practically
coincides with the sine curve. Can you see why the even-numbered coefficients are 7ero? Why a3 is the absolutely
biggest coefficient?
•
E X AMP L E 3
Fourier-Bessel Series
In Example 6 of Sec. 5.7 we obtained infinitely many orthogonal sets of Bessel functions, one for each of Jo,
ft, J2 , • • • . Each set is orthogonal on an interval 0 ~ x ~ R with a fixed positive R of our choice and with
respect to the weight x. The orthogonal set for I n is In(kn,lX), In(kn,2X), In(kn,3x), ... , where n is fixed and
kn,'In is given in (12), Sec. 5.7. The corresponding Fourier-Bessel series is
(9)
f(x)
=
L
a",Jn(kn,'ln x )
=
a1Jn(kn ,1 x)
+ a2Jn(kn,2X) + a3Jn(kll,3X) +
(n
fixed).
1n=1
The coefficients are (with O'n,m = kn,mR)
a'In =
(10)
m
=
1,2,"
because the square of the norm is
(11)
as we state without proof (which is tricky; see the discussion beginning on p. 576 of [A13]).
For instance, let us consider f(x) = I - x 2 and take R = I and n = 0 in the series (9), simply writing A for
0'0,'In' Then kn,m = O'O,m = A = 2.405, 5.520, 8.654, 11.792, etc. luse a CAS or Table Al in App. 5). Next we
calculate the coefficients ~ by (10),
am =
2
J 1 (A)
-2--
II
2
x(l - x )Jo(Ax) dx.
0
This can be integrated by a CAS or by formulas as follows. First use [xftlAx)], = AxJolAx) from Theorem 3
in Sec. 5.5 and then integration by parts,
am
=
2
-2J 1 (Al
I
1
2
x(1 - x )Jo(Ax) dx
2
=
-2-
[
ft (A)
0
I
-
A
2
(l - x )xft(Ax)
11 0
The integral-free part is zero. The remaining integral can be evaluated by [x2lz(Ax)
3 in Sec. 5.5. This gives
I
A
l'
=
I
1
xft(Ax)(-2x) d, ]
0
Ax2ft (Ax) from Theorem
a'In =
(A = 0'0,,,,)'
Numenc values can be obtained from a CAS (or from the table on p. 409 of Ref. [GRI] in App. I, together
with the formula J2 = 2,-Ift - Jo in Theorem 3 of Sec. 5.5). This gives the eigenfunction expansion of
I - x 2 in terms of Bessel functions J0, that is,
I - x
2
= 1.1081Jo(2.405x) - 0.1398Jo(5.520x)
2
+ 0.0455Jo(8.654x)
0.02IOJo(11.792,)
+ ....
A graph would show that the curve of I - x and that of the sum of the first three terms practically coincide. •
214
CHAP. 5
Series Solutions of ODEs. Special Functions
Mean Square Convergence.
Completeness of Orthonormal Sets
The remaining part of this section will give an introduction to a convergence suitable in
connection with orthogonal series and quite different from the convergence used in
calculus for Taylor series.
In practice, one uses only orthonormal sets that consist of "sufficiently many" functions,
so that one can represent large classes of functions by a generalized FOUlier series (3)certainly all continuous functions on an interval a ~ x ~ b, but also functions that do "not
have too many" discontinuities (see Example 1). Such orthonormal sets are called "complete"
(in the set of functions considered; definition below). For instance, the orthonormal sets
corresponding to Examples 1-3 are complete in the set of functions continuous on the
intervals considered (or even in more general sets of functions; see Ref. [OR7], Secs. 3.4-3.7,
listed in App. 1. where "complete sets" bear the more modem name "total sets").
In this connection, convergence is convergence in the norm, also called mean-square
convergence; that is. a sequence of functions fk is called convergent tvith the limit f if
(12*)
lim
k~x
Ilfk - fll
= 0;
written out by (2) (where we can drop the square root, as this does not affect the limit)
lim
(12)
I
b
r(x)[fk(X) - f(X)]2 dx
k_x a
Accordingly, the series (3) converges and represents
(13)
I
lim
= O.
f if
b
k_x a
r(x)[Sk(X) - f(X)]2 dx
= 0
where Sk is the hh partial sum of (3),
k
(14)
sk(x)
=
L
(lmYm(x).
1n=0
By definition, an 0l1honormal set )'0' YI' . . . on an interval {l ~ x ~ b is complete ill
set of fimctiolls S defined on (/ ~ x ~ b if we can approximate every f belonging to S
arbitrarily closely by a linear combination ao)'o + al)'1 + ... + akYk, that is, technically,
if for every E > 0 we can find constants (/0, . . . , {lk (with k large enough) such that
{l
(15)
An interesting and basic consequence of the integral in (13) is obtained as follows.
Performing the square and using (14), we first have
I
b
r(x)[Sk(X) - f(X)]2
dr
=
a
I
b
a
=
I
b
a
I
J2
b
2
rS k d-r -
[
I'
k
L
1n.=O
2
amY",
rfSk d-r
+
a
I
b
rf2 dr
a
k
dx - 2
L
Ul,=o
!
am
I
b
a
rfYm dx
+
I
b
rp dx.
a
The, f~rst in~egral on the right equals L a",2 ?ecause rYmYz dr = 0 for III =1= I, and
Ir)m dx - 1. In the second sum on the fIght, the mtegral equals am' by (4) with
SEC. 5.B
Orthogonal Eigenfunction Expansions
215
II Ym II 2
= 1. Hence the first term on the right cancels half of the second term, so that the
right side reduces to
k
- 'L
am
2
+
I
b
rp dx.
a
1lZ.=O
This is nonnegative because in the previous formula the integrand on the left is nonnegative
(recall that the \Veight r(x) is positive!) and so is the integral on the left. This proves the
important Bessel's inequality
11/112 =
(16)
I
b
(k
r(x)/(x)2 dx
= 1,2, .. ').
a
Here we can let k ---7 YJ, because the left sides foml a monotone increasing sequence that
is bounded by the right side, so that we have convergence by the familiar Theorem 1 in
App. A3.3. Hence
(17)
m=O
Furthermore, if )'0, )'1 • . . . is complete in a set of function~ S. then (13) holds for every
By (15) this implies equality in (16) with k ---7 :>C. Hence in the case of
completeness every 1 in S satisfies the so-called Parseval's equality
1 belonging to S.
II 1 112 =
(18)
I
b
r(x)/(x)2 dr.
a
As a consequence of (18) we prove that in the case of complete1less there is no function
orthogonal to every function of the orthonormal set. with the trivial exception of a function
of zero norm:
THEOREM 1
Completeness
Let Yo, )'1 •... be a complete ort/101101711al set on a ~ x ~ b in (I set offunctions S.
Then if a .function 1 belongs to S alld is orthogonal to every )'m, it must have norm
zero. In particular, if 1 is continuous, then 1 must be identical!.v zero.
PROOF
E X AMP L E 4
Since 1 is orthogonal to every )'m' the left side of (18) must be zero. If 1 is continuous,
then II 1 II = 0 implies I(x) == 0, as can be seen directly from (2) with 1 instead of )'m
because r(x) > O.
•
Fourier Series
The orthonormal set in Example I is complete in the set of continuous functions on -7T~ x
that fIx) == 0 is the only continuous function orthogonal to all the functions of that set.
Solutioll.
Lef f be any continuou~ function. By the orthogonality (we can unlit
J"
~ 7T.
Vh and \
J(x) sin IIIX d,"
Verify directly
S),
~ O.
-r.
Hence am ~ 0 and b m ~ 0 in (6) for all III. su thaI (3) reduces to J(X)
==
O.
•
CHAP. 5
216
Series Solutions of ODEs. Special Functions
This is the end of Chap. 5 on the power series method and the Frobenius method, which
are indispensable in solving linear ODEs with variable coefficients, some of the most
important of which we have discussed and solved. We have also seen that the latter are
important sources of special functions having orthogonality properties that make them
suitable for orthogonal series representations of given functions.
11-41
-
:
FOURIER-LEGENDRE SERIES
Showing the details of your calculations, develop:
1. 7x 4 - 6x 2
2. (x + 1)2
5. Prove that if f(x) in Example 2 is even [that is,
f(x) = f( - x)], its series contains only P m(x) with
even 111.
16-161
CAS EXPERIMENTS. FOURIER-LEGENDRE
SERIES
Find and graph (on common axes) the partial sums up to
that S"'o whose graph practically coincides with that of/ex)
within graphical accuracy. State what 1Il0 is. On what does
the size of IIlO seem to depend?
6. f(x)
sin TTX
7. f(x)
sin 27TX
8. f(x)
cos TTX
9. f(x)
cos 27TX
10. f(x)
cos 37TX
n. f(x) eX
on the speed of convergence by observing the decrease
of the coefficients.
{c) Take f(x) = 1 in (19) and evaluate the integrals
for the coefficients analytically by (24a), Sec. 5.5, with
JJ = I. Graph the first few partial sums on common
axes.
18. TEAM PROJECT. Orthogonality on the Entire
Real Axis. Hermite Polynomials. 13 These orthogonal
polynomials are defined by HeoO) = 1 and
REMARK. As is true for many special functions, the
literature contains more than one notation, and one
sometimes defines as Hermite polynomials the
functions
H *(x)
x2
12. f(x)
e13. f(x) = 1/(1 + x 2)
14. f(x) = 10(aO,1x). where aO,1 is the first positive zero
of 10
15. f(x) = 10(aO,2x), where aO,2 is the second positive
zero of 10
16. f(x) = 11ta1,1x), where a1,1 is the first positive zero
of 11
17. CAS EXPERIMENT. Fourier-Bessel Series. Use
Example 3 and again take n = 10 and R = 1. so that
you get the series
(19)
flx) = al10lao,lX)
+
a210(aO,2x)
+
a310lao,3x)
+ ...
with the zeros 0:0,1 0:0,2'
Table AI in App. 5).
.•.
from your CAS (see also
(a) Graph the terms 10(aO,lx), ... , 10(aO,lOx) for
~ l on common axes.
o~ x
(b) Write a program for calculating partial sums of
(9). Find out for what f(x) your CAS can evaluate the
integrals. Take two such f(x) and comment empirically
n
2
d ne
_X2
= (-l)ne" - - dxn
This differs from our definition, which is preferred in
applications.
(a) Small Values of n. Show that
He1(X) = x,
He3(X) = x 3 - 3x,
He2(X) = x 2 - 1,
He4(X) = X4 - 6x 2
+ 3.
(b) Generating Function. A generating function of
the Hermite polynomials is
-n
(20)
etx-t2/2 =
L
anlx)f n
n=O
because He.,(x) = n!anlx), Prove this. Hint: Use the
formula for the coefficients of a Maclaurin series and
note that tx - ~f2 = ~X2 - ~(x - t)2.
(C) Derivative. Differentiating the generating function
with respect to x, show that
(21)
13CHARLES HERMITE (1822-1901), French mathematician, is known for his work in algebm and number
theory. The great HENRI POINCARE 11854-1912) was one of his students,
217
Chapter 5 Review Questions and Problems
(d) Orthogonality on the x-Axis needs a weight
function that goes to zero sufficiently fast as x ---? ::,:::cc.
(Why?) Show that the Hermite polynomials are
orthogonal on -<Xl < X < cc with respect to the weight
function rex) = e- x2/2 . Hint. Use integration by parts
and (21).
(e) ODEs. Show that
(22)
He~(x) = xHen(x) - Hen+l(x).
Using this with n - 1 instead of nand (21), show that
y = Hen(x) satisfies the ODE
(23)
y" - xy'
Show that w
equation14
=
(24)
w"
+
O.
=
e- X2/4y is a solution of Weber's
+~-
(n
+ ny
!x 2 )w = 0
(n=O,l,···).
19. WRITING PROJECT. Orthogonality. Write a short
report (2-3 pages) about the most important ideas and
facts related to orthogonality and orthogonal series and
their applications.
TIONS AND PROBLEMS
1. What is a power series? Can it contain negative or
fractional powers? How would you test for convergence?
18. (x 2 -
2. Why could we use the power series method for
Legendre's equation but needed the Frobenius method
for Bessel's equation?
3. Why did we introduce twO kinds of Bessel functions,
J and Y?
4. What is the hypergeometric equation and why did Gauss
introduce it?
5. List the three cases of the Frobenius method, giving
examples of your own.
6. What is the difference between an initial value problem
and a boundary value problem?
7. What does orthogonality of functions mean and how is
it used in series expansions? Give examples.
8. What is the Sturm-Liouville theory and its practical
importance?
9. What do you remember about the orthogonality of the
Legendre polynomials? Of Bessel functions?
10. What is completeness of orthogonal sets? Why is it
important?
20. x 2y"
+
~1-251
BESSEL'S EQUATION
[I ~-~
SERIES SOLUTIONS
Find a basis of solutions. Try to identify the series as
expansions of known functions. (Show the details of your
work.)
11. y" - 9y = 0
12. (1 - X)2y" + (1 - x)y' - 3y = 0
13. xy" - (x + l)y' + y = 0
14. x 2 y" - 3xy' + 4y = 0
15. y" + 4xy' + (4x 2 + 2)y = 0
16. x 2 y" - 4xy' + (x 2 + 6)y = 0
17. xy"
+ (2x + I)Y' + (x + l)y
19. (x
2
0
l)y" +
-
+
xy'
(4x 4
0
0
=
=
1)y = 0
-
Find a general solution in terms of Bessel
the indicated transformations and show the
21. x 2y" + xy' ., (36x 2 - 2))" = 0
22. x 2y" + 5xy' + (x 2 - 12)y = 0
23. x 2y" + xy' + 4(x 4 - l)y = 0
functions. (Use
details.)
(6x = z)
(y = u/x 2)
(x 2 = z)
24. 4x 2y" - 20xy' + (4x 2 + 35)y = 0
25. y" + k 2 x 2y = 0
(y = uVx, ~kX2
y' (0)
=
z)
30. y"
0,
+ Ay
131-~
=
0,
y(l)
+ xy' + (Ax 2 - J)y
=
0
y' (1)
=
28. (xy')' + Ax- 1 y
(Set x = e t .)
29. x 2y"
yeO)
x 3 u)
(y =
126-301 BOUNDARY VALUE PROBLEMS
Find the eigenvalues and eigenfunctions.
26. y" + Ay = 0,
yeO) = 0,
y' (7T)
27. y" + Ay = 0,
yeO) = y(I).
=
y(1)
0,
=
=
0,
=
y(e)
=
O.
0,
0
yeO)
+ y'(O)
=
0,
y(27T)
=
0
CAS PROBLEMS
Write a program, develop in a Fourier-Legendre series, and
graph the first five partial sums on common axes, together
with the given function. Comment on accuracy.
31. e 2x ( - 1 ~ x ~ I)
32. sin ( 7TX 2) ( - 1 ~ x ~ I)
33. 11(1
=
+ 2y
4xy' + 2y
l)y" - 2xy'
+
Ixl)
(-1 ~ x ~ 1)
34. Icos 7Txl (-1 ~ x ~ 1)
35. x if 0 ~ x ~ 1,0 if -1
14HEINRICH WEBER (1842-1913), German mathematician.
~
x < 0
218
CHAP. 5
Series Solutions of ODEs. Special Functions
Series Solution of ODEs. Special Functions
The power series method gives solutions of linear ODEs
y"
(1)
+ p(.'l)y' + q(.'l)Y
= 0
with variable coefficients p and q in the form of a power series (with any center
e.g., .'lo = 0)
.'lo,
(2)
Y(.'l)
=
L
am(.'l -
.'lo)m
=
(10
+
(l1(.'l -
.'lo)
+
(l2(X -
XO)2
+ ....
In=O
Such a solution is obtained by substituting (2) and its derivatives into (I). This gives
a recurrence formula for the coefficients. You may program this formula (or even
obtain and graph the whole solution) on your CAS.
If p and q are analytic at .'lo (that is. representable by a power series in powers
of x - .'lo with positive radius of convergence: Sec. 5.2). then (I) has solutions of
this form (2). The same holds if h. p. q in
h(x»)""
+ p(.'l)y' +
q(.'l)Y
= 0
are analytic at .'lo and h(.'lo) oF O. so that we can divide by h and obtain the standard
form 0). Legendre's equation is solved by the power series method in Sec. 5.3.
The Frobenius method (Sec. 5.4) extends the power selies method to ODEs
y"
(3)
(I(X),
+ - - - \" +
X -
.'lo .
b(x)
(x -
\"
xO)2 .
= 0
whose coefficients are singular (i.e., not analytic) at xo. but are "not too bad,"
namely, such that a and b are analytic at Xo. Then (3) has at least one solution of
the form
(4) y(x)
=
(x -
xor L
111,
(I",(x -
.'loY'" =
(lo(x -
xor
+ (/l(X
- XO)'"+I
+ ...
0
where r can be any real (or even complex) number and is determined by substituting
(4) into (3) from the indicial equation (Sec. 5.4), along with the coefficients of (4).
A second linearly independent solution of (3) may be of a similar form (with different
rand (l11/S) or may involve a logarithmic term. Bessel's equation is solved by the
Frobenius method in Secs. 5.5 and 5.6.
"Special functions" is a common name for higher functions. as opposed to the
usual functions of calculus. Most of them arise either as nonelementary integrals
fsee (24)-(44) in App. 3.11 or as solutions of (1) or (3). They get a name and notation
and are included in the usual CASs if they are important in application or in theory.
Summary of Chapter 5
219
Of this kind, and particularly useful to the engineer and physicist, are Legendre's
equation and polynomials Po, PH ... (Sec. 5.3), Gauss's hypergeometric
equation and functions F(a, b, c; x) (Sec. 5.4), and Bessel's equation and
functions J v and Y v (Secs. 5.5, 5.6).
Modeling involving ODEs usually leads to initial value problems (Chaps. 1-3)
or boundary value problems. Many of the latter can be written in the form of
Sturm-Liouville problems (Sec. 5.7). These are eigenvalue problems involving
a parameter A that is often related to frequencies, energies, or other physical
quantities. Solutions of Sturm-Liouville problems, called eigenfunctions, have
many general properties in common, notably the highly important orthogonality
(Sec. 5.7), which is useful in eigenfunction expansions (Sec. 5.8) in terms of cosine
and sine (··Fourier series", the topic of Chap. 11), Legendre polynomials, Bessel
functions (Sec. 5.8), and other eigenfunctions.
••••
CHAPTER
6
Laplace Transforms
The Laplace transform method is a powerful method for solving linear ODEs and
corresponding initial value problems, as well as system~ of ODEs arising in engineering.
The process of solution consists of three steps (see Fig. 112).
Step 1. The given ODE is transformed into an algebraic equation ("subsidiary
equation").
Step 2. The subsidiary equation is solved by purely algebraic manipulations.
Step 3. The solution in Step 2 is transformed back, resulting in the solution of the given
problem.
IVP
IInitial Value f---~
Problem
Fig. 112.
1---
Solution
of the
®
IVI"
[
I
~
Solving an IVP by Laplace transforms
Thus solving an ODE is reduced to an algebraic problem (plus tho~e transformations).
This switching from calculus to algebra is called operational calculus. The Laplace
transform method is the most important operational method to the engineer. This method
has two main advantages over the usual methods of Chaps. 1-4:
A. Problems are solved more directly, initial value problems without first determining
a general solution. and nonhomogeneous ODEs without first solving the corresponding
homogeneous ODE.
B. More importantly, [he use of the unit step function (Heaviside function in
Sec. 6.3) and Dirac's delta (in Sec. 6.4) make the method particularly powerful for
problems with inputs (driving forces) that have discontinuities or represent short impulses
or complicated periodic functions.
In this chapter we consider the Laplace transform and its application to engineering
problems involving ODEs. PDEs will be solved by the Laplace transform in Sec. 12.11.
General formulas are listed in Sec. 6.8, transforms and inverses in Sec. 6.9. The
usual CASs can handle most Laplace transforms.
Prerequisite: Chap. 2
Sections that lIlay be omitted in a shorter course: 6.5, 6.7
References and Answers to Problems: App. 1 Part A, App. 2.
220
SEC. 6.1
6.1
Laplace Transform. Inverse Transform. Linearity. s-Shifting
221
Laplace Transform. Inverse Transform.
Linearity. s-Shifting
If f(t) is a function defined for all t ~ 0, its Laplace transform l is the integral of f(t)
times e- st from t = 0 to x. It is a function of s, say, F(s), and is denoted by ;£(f); thus
F(s) = 9::(f) = fCe-stf(t) dt.
(1)
o
Here we must assume that f(t) is such that the integral exists (that is, has some finite
value). This assumption is usually satisfied in applications-we shall discuss this near the
end of the section.
Not only is the result F(s) called the Laplace transform, but the operation just described,
which yields F(s) from a given f(t), is also called the Laplace transform. It is an "integral
transform"
F(s) =
fC
k(s, t)f(t) dt
o
with '·kernel" k(s, t) = e- st .
Furthermore, the given function f(t) in (1) is called the inverse transform of F(s) and
is denoted by 9::- I (F); that is, we shall write
f(t)
(1*)
= 9::- l (F).
Note that (1) and (1 *) together imply 9::-\9::(f) = f and 9::(9::- l (F» = F.
Notation
Original functions depend on t and their transforms on s-keep this in mind! Original
functions are denoted by lowercase letters and their transforms by the same letters in
capital, so that F(s) denotes the transform of f(t), and Y(s) denotes the transform of y(t),
and so on.
E X AMP L E 1
Laplace Transform
Let l(t) = 1 when t
Solution.
~
O. Find F(s).
From (1) we obtain by integration
5£(f) = 5£(1) =
LOG e- st dt =
o
_
~s e- st I""
0
s
(s> 0).
IPIERRE SIMON MARQUIS DE LAPLACE (1749-1827), great French mathematician, was a professor in
Paris. He developed the foundation of potential theory and made important contributions to celestial mechanics,
astronomy in general, special functions. and probability theory. Napoleon Bonaparte was his student for a year.
For Laplace's interesting political involvements. see Ref. [GR2], listed in App. I.
The powerful practical Laplace transform techniques were developed over a century later by the English
electrical engineer OLIVER HEAVISIDE (1850-1925) and were often called "Heaviside calculus."
We shall drop variables when this simplifies formulas without causing confusion. For instance, in (1) we
wrote 5£(f) instead of 5£(f)(s) and in (I *) 5£-l(F) instead of 5£-\F)(t).
CHAP. 6
222
Laplace Transforms
Our notation is convenient, but we should say a word about it. The interval of integration in (1) is infinite.
Such an integral is called an improper integral and, by definition, is evaluated according to the rule
oc
e -stfU) dt = Jim
(
1~cx:;
Jo
f
T
e -Si.f(t) dt.
0
Hence our convenient notation means
(s> 0) .
•
We shall use thi, notation throughout this chapter.
E X AMP L E 2
Laplace Transform .:£(eat ) of the Exponential Function eat
Let f(t)
=
eat when t :0;; 0, where a is a constant. Find :.f(f).
Solution.
Again by (1),
hence, when s - a > 0,
•
Must we go on in this fashion and obtain the transform of one function after another
directly from the definition? The answer is no. And the reason is that new transforms can
be found from known ones by the use of the many general properties of the Laplace
transform. Above all, the Laplace transform is a "linear operation," just as differentiation
and integration. By this we mean the following.
THEOREM 1
Linearity of the Laplace Transform
The Laplace transform is a linear operation; that is, for anyfunctions f(t) a17d g(t) whose
transjol7lls exist and any constants a and b the tran,~for11l of afft) + bg(t) exists, and
.:£{af(t)
PROOF
a.:£{f(t)}
+
b.:£{g(t)}.
By the definition in (1),
.:£{af(t)
+ bg(t)} =
L=e-st[af(t)
o
=a
E X AMP L E 3
+ bgU)} =
+ bg(t)l dt
f'e-stf(t) dt
o
+ b ICCe-stg(t) dt
=
a.:£{f(t)}
+ b.:£{gU)}.
•
0
Application of Theorem 1: Hyperbolic Functions
Find the transforms of cosh at and sinh aI.
Solution.
Since cosh at
~(eat
=
+
e -at) and sinh at = ~(eat -
e -at), we obtain from Example 2 and
Theorem 1
1
. =1
.'£(coshat) = -2 (:£(e at )
!f(smhat)
+
-(!f(e a t ) - !f(e-at»
2
I( I + -+1)
- = -1( - - _ _1)
_
+
5£'(e- at» =
-
2
--
s - a
s
a
J
2
s - a
s
a
-
s -2
s2 - a
a_
= __
.1'2 - a 2 •
•
SEC. 6.1
Laplace Transform. Inverse Transform. Linearity. s-Shifting
EXAMPLE 4
223
Cosine and Sine
Derive the formulas
.:f(cos wt)
s
=
s2
w
9'(sin wt) =
+ w2 '
2
S
+
2'
w
Solution by Calculus. We wlite Lc = !£(cos wt) and Ls = .'t:(sin wt). Integrating by pmts and noting that the
integral-free parts give no contribution from the upper limit 00. we obtain
e~:t coswtl~-
Lc= lCOe-stcoswtdt=
Ls
=
fooo e -st sin wt dt =
-; lCOe-stsin wtdt =
~
e -st sin wtl = +
-s
0
s
(= e -st cos wt dt =
Jo
s
~ Lc'
s
By substituting Ls into the formula for Lc on the light and then by substituting Lc into the formula for Ls on
the right. we obtain
L
=
C
L
s =
..!..s -
~s(!.'!..L)
se'
~S (..!..S - ~L)
s
S
s
w2)
1
Lc ( 1+2 = - ,
s
s
w
'
Solution by Transforms Using Derivatives.
S
2
Ls
'
See next section.
In Example 2, if we set a = iw with i =
Soilltion by Complex Methods.
w
=
\1=1,
we obtain
s + iw
s
w
. t I s + iw
:i(e'W ) = - - - = - - - - - - = - - - = - - - + i - - s2 + w2
s2 + w 2
,,2 + w2
< - iw
(s - iw)(s + iw)
Now by Theorem I and e iwt
=
.
cos wt + i sin wt [see (11) in Sec. 2.2 with wt instead of t] we have
5£(eiwt )
=
5£(cos wt + i sin wt)
=
':ECcos wt) t- i.'£(sin wt).
If we equate the real and imaginary parts of this and the previous equation, the result follows. (This formal
calculation can be justified in the theory of complex integration.)
•
Basic transforms are listed in Table 6.1. We shall see that from these almost all the others
can be obtained by the use of the general propelties of the Laplace transform. Formulas
1-3 are special cases of formula 4, which is proved by induction. Indeed, it is true for
n = 0 because of Example 1 and O! = 1. We make the induction hypothesis that it holds
for any integer n ~ 0 and then get it for /1 + 1 directly from (1). Indeed, integration by
parts first gives
Now the integral-free part is zero and the lasl part is (n
and the induction hypothesis,
n
+ 1
s
This proves formula 4.
n
+
s
1
n!
+
1)/s times :£(tn). From this
(n
+ I)!
224
CHAP. 6
Table 6.1
I
Laplace Transforms
Some Functions f(t) and Their Laplace Transforms ~(f)
f(t)
~(f)
1
1
lis
7
cos wt
2
t
lIs2
8
sin wt
3
t2
2 !Is 3
9
cosh ar
s
S2 - a 2
4
tn
(n = 0, 1, ... )
sn+l
10
sinh at
a
S2 - a 2
5
ta
(a positive)
sa+l
11
eat cos wt
s-a
(s - a)2 + uJ
6
eat
L
-s-a
12
eat sin wt
w
(s - a)2
~(f)
J(t)
n!
rCa + 1)
I
s
S2
+ w2
W
S2
+ uJ
+ uJ
I
f(a + I) in formula 5 is the so-called gamma function [(15) in Sec. 5.5 or (24) in
App. A3.1]. We get formula 5 from (1), setting st = x:
where s > O. The last integral is precisely that defining f(a + 1), so we have
f(a + I)/sa+t, as claimed. (CAUTION! f(a + 1) has x a in the integral, not x a + 1 .)
Note the formula 4 also follows from 5 because f(n + I) = n! for integer n ;::::; O.
Formulas 6-10 were proved in Examples 2-4. Fonllulas 11 and 12 will follow from 7
and 8 by "shifting," to which we tum next.
s-Shifting: Replacing 5
by 5
-
a in the Transform
The Laplace transform has the very useful property that if we know the transform of f(t),
we can immediately get that of eatf(t), as follows.
THEOREM 2
First Shifting Theorem, s-Shifting
Iff(f) has the transfonn F(s) (where s > kfor some k), thell eatf(t) has the transform
F(s - a) (where s - a > k). In fonnuias,
or, (f we take the inverse on both sides,
SEC. 6.1
225
Laplace Transform. Inverse Transform. Linearity. s-Shifting
PROOF
We obtain F(s - a) by replacing s with s - a in the integral in (1), so that
F(s - a) = {'e-Cs-a)tf(t) dt = ["e-st[eatf(t)] dt = .:£{eatf(t)}.
o
0
If F(s) exists (i.e., is finite) for s greater than some k, then our first integral exists for
s - a > k. Now take the inverse on both sides of this formula to obtain the second formula
•
in the theorem. ( ~AUTION! -a in F(s - a) but +a in eatf(t).)
E X AMP L E 5
s-Shifting: Damped Vibrations. Completing the Square
From Example 4 and the first shifting theorem we immediately obtain formulas II and 12 in Table 6.1,
~{eat
s-a
cos wt} = ------;;------;;(s
a)2 + u} ,
For instance, use these formulas to find the inverse of the transform
3s - 137
5£(f)
Solution.
I
=
s
2
+ 2s + 401
.
Applying the inverse transform. using its linearity (Prob. 28). and completing the square. we obtain
= ~-1{ 3(s + 1) - 140} = 3~-1{
(s + 1)2 + 400
(s
s + I
} _ 7:J:- 1 {
+ 1)2 + 202
-
(s
+
20
}.
1)2 + 202
We now see that the inverse of the right side is the damped vibration (Fig. 113)
I(t)
6
4
=
e -t(3 cos 20 t - 7 sin 20 t).
•
~
2
A
0
05
.0
-2
-4 '
-6
Fig. 113.
Vibrations in Example 5
Existence and Uniqueness of Laplace Transforms
This is not a big practical problem because in most cases we can check the solution of
an ODE without too much trouble. Nevertheless we should be aware of some basic facts.
A function f(t) has a Laplace transform if it does not grow too fast, say, if for all
t ~ 0 and some constants M and k it satisfies the "growth restriction"
(2)
226
CHAP. 6
Laplace Transforms
(The growth restriction (2) is sometimes called "growth of exponential order," which may
be misleading since it hides that the exponent must be kt, not kt 2 or similar.)
f(t) need not be continuous, but it should not be too bad. The technical term (generally
used in mathematics) is piecewise continuity. f(t) is piecewise continuous on a finite interval
a ~ t ~ b where f is defined, if this interval can be divided into finitely many subintervals
in each of which f is continuous and has finite limits as t approaches either endpoint of such
a subinterval from the interior. This then gives finite jumps as in Fig. 114 as the only possible
discontinuities, but this suffices in most applications, and so does the following theorem.
a
\",
b
Fig. 114. Example of a piecewise continuous function fIt).
(The dots mark the function values at the jumps.)
THEOREM 3
Existence Theorem for Laplace Transforms
If f(t) is defined alld piecewise colltinuolls on every finite imerval on the semi-alCis
t ~ 0 and satisfies (2) for all t ~ 0 and some constants M and k, then the Laplace
transform ;t(n exists for all s > k.
PROOF
Since f(t) is piecewise continuou~, e-stf(t) is integrable over any finite interval on the
t-axis. From (2), assuming that s > k (to be needed for the existence of the last of the
following integrals), we obtain the proof of the existence of ;t(n from
Note that (2) can be readily checked. For instance, cosh t < e t , t n < n!e t (because t n /ll!
is a single term of the Maclaurin series), and so on. A function that does not satisfy (2)
t2
for any M and k is e (take logarithms to see it). We mention that the conditions in
Theorem 3 are sufficient rather than necessary (see Prob. 22).
Uniqueness. If the Laplace transform of a given function exists, it is uniquely
determined. Conversely, it can be shown that if two functions (both defined on the positive
real axis) have the same transform. these functions cannot differ over an interval of positive
length, although they may differ at isolated points (see Ref. [AI4] in App. 1). Hence we
may say that the inverse of a given transform is essentially unique. In particular, if two
cominuolls functions have the same transform, they are completely identical.
--.
:.:- :
11-20
I
LAPLACE TRANSFORMS
Find the Lapial:e transforms of the following functions.
Show the details of your work. (a. b, k, w, B are constants.)
1. t 2
-
2t
2. (t2 - 3 f
4. sin 2 4r
6. e- t sinh 5t
3. cos 2,Trt
5. e 2t cosh t
7. cos (wt
9. e3a-2bt
+
B)
8. sin (3t - ~)
10. -8 sin O.2t
SEC. 6.2
Transforms of Derivatives and Integrals. ODEs
kD _
11. sin t cos t
13.
12. (t
14.
1~
ll
16.
I
I
I
I
a
b
k~
28. (Inverse transform) Prove that :£-1 is linear. Hint.
Use the fact that :£ is linear.
129-401 INVERSE LAPLACE TRANSFORMS
Given F(s) = :£(f), find f(t). Show the details. (L, n, k, a,
17 are constants.)
29.
31.
18.
4.1 .1
b
2
170 b
1)3
klI=L
b
15.
+
227
k~
.1
2
4
31T
+
~
-
3.1
.1
33.
2
30.
+
12
5
n1TL
L 2 s2
+
32.
34.
n2~
b
b
19.
'~
-1
I
I
20.
1~
I
I
37.
36.
+ 4.1
1
v3)(s
(.I -
39.
22. (Existence) Show that :£(llVt) = ~. [Use
(30) r@ = V; in App. 3.1.J Conclude from this that
the conditions in Theorem 3 are sufficient but not
necessary for the existence of a Laplace transform.
23. (Change of scale) If :£(f(t» = F(s) and c is any
positive constant, show that .'£(f(ct» = F(s/c)/c. (Hint:
Use (1).) Use this to obtain :£(cos wt) from :£(cos t).
24. (Nonexistence) Show that e
condition of the form (2).
t2
does not satisfy a
25. (Nonexistence) Give simple examples of functions
(defined for all x ~ 0) that have no Laplace transform.
26. (Table 6.1) Derive formula 6 from formulas 9 and lO.
27. (Table 6.1) Convert Table 6.1 from a table for finding
transforms to a table for finding inverse transforms (with
obvious changes. e.g .. :£-l(lIs n ) = t n - 1 /(n - I)!. etc.).
+
16
-
16
10
2.1
+Yz
20
(.I -
(k
L
+ Vs)
1
-----2
.1
5
.I
5
+
+
38.
40.
+ 4)
1)(.1
.I
k~l
2
-----'
21. Using :£(f) in Prob. 13, find :£(f1), where fN) = 0 if
t ~ 2 and f1(r) = 1 if t > 2.
6.2
.1
2
.1
2
4
8
35.
2.1
+ 1)2
+ k2
18.1 - 12
2
9.1 - 1
(.I
+
+ b)
a)(s
APPLICATIONS OF THE FIRST SHIFTING
THEOREM (s-SHIFTING)
141-541
In Probs. 41--46 find the transform. In Probs. 47-54 find
the 1l1verse transform. Show the details.
41. 3.Ste 2 . 4t
42. - 3t 4 e- O . 5f
at
43. 5e- sin wt
44. e- 3t cos 1Tt
45. e-kt(o cos t + 17 sin t)
46. e-t(ao + a1t + ... + o.J n )
47.
7
48.
(.I - 1)3
Vs
49.
+ Yz)3
2
+ 4.1 +
2
+
50.
15
.1
.1
1T)2
1)2 + 4
(.I -
4.1 -
52.
29
1T
53.
+
.1-6
(.I
51.
1T
(.I
10m
+
24~
54.
.1
2
-
6.1
2
+ 18
2.1 - 56
.1
2
-
4.1 -
12
Transforms of Derivatives and Integrals.
ODEs
The Laplace transform is a method of solving ODEs and initial value problems. The crucial
idea is that operations of calculus on functions are replaced by operations of algebra
on transfonns. Roughly, differentiation of f(t) will correspond to multiplication of 5£(f)
by s (see Theorems 1 and 2) and integration of f(t) to division of 5£(f) by s. To solve
ODEs, we must first consider the Laplace transform of derivatives
CHAP.6
228
THE 0 REM 1
Laplace Transforms
Laplace Transform of Derivatives
The transforms of the first and second derivatives of f(t) satisfy
5£(j') = s5£(f) - f(O)
(1)
(2)
5£(f") = s25£(f) - sf(O) -
ff (0).
F0I711ula (1) holds if f(t) is contilluousforall t ~ 0 and satisfies the growth restriction
(2) ill Sec. 6.1 and f' (t) is piecewise continuous on every finite imen'al on the semiaxis t ~ O. Similarly, (2) holds if f and f' are continuous for all t ~ 0 and satisfy
the growth restriction and f" is piecewise continuous on every finite interval on the
semi-ar:is t ~ O.
PROOF
We prove (l) first under the additional assumption that j' is continuous. Then by the
definition and integration by parts,
Since f satisfies (2) in Sec. 6.1, the integrated part on the right is zero at the upper limit
when s > k, and at the lower limit it contributes - f(O). The last integral is 5£(f). It exists
for s > k because of Theorem 3 in Sec. 6.1. Hence .c£(j' ) exists when s > k and (1) holds.
If j' is merely piecewise continuous, the proof is similar. In this case the interval of
integration of f' must be broken up into parts such that j' is continuous in each such part.
The proof of (2) now follows by applying (1) to f" and then substituting (1), that is
.c£(f")
=
s.c£(f') - reO)
=
s[s.c£(f) - f(O)]
=
s 2.c£(f) - sf CO) - rCO).
•
Continuing by substitution as in the proof of (2) and using induction, we obtain the
following extension of Theorem 1.
THEOREM 2
Laplace Transform of the Derivative f
(n)
of Any Order
Let f, j', . .. , In-ll be continuous for all t ~ 0 and satisfy the growth restriction
(2) in Sec. 6.1. Furthermore, let In} be piecewise continuous on every finite interval
on the semi-axis t ~ O. Then the transform of In} satisfies
E X AMP L E 1
Transform of a Resonance Term (Sec. 2.8)
Let f(t) =
by (2),
!
sin Cd!. Then f{O) = 0, f' (t) = sin Cdr
s - 2 - Cd2 f£(f) = s 2 f£(f),
f£(f" ) = 2Cd - 2 -
s +
Cd
+ Cdr cos Cdr, f' (0)
thus
=
0,
I' =
2Cd cos Cdr - Cd2 r
sin Cdr. Hence
•
SEC. 6.2
229
Transforms of Derivatives and Integrals. ODEs
E X AMP L E 2
Formulas 7 and 8 in Table 6.1, Sec. 6.1
This is a third derivation of ;£(cos wt) and ;£(sin wt); cf. Example 4 in Sec. 6.1. Let f(t) = cos wt. Then
f(O) = I, f' (0) = 0, f"(t) = _w2 cos wt. From this and (2) we obtain
XU") = s2:£(f) -
.I
.I
-w2X(f).
=
P( cos wt) = - 2 - - 2
.I + w
By algebra,
Similarly, let g = sin wt. Then g(O) = 0, g' = w cos wf. From this and (I) we obtain
X(g')
=
s'£(g)
= w:£(cos
w
.:r(sin wt) = -:£(cos wt) =
Hence
wt).
.I
•
w
-2--2 .
.I
+ w
Laplace Transform of the Integral of a Function
Differentiation and integration are inverse operations, and so are multiplication and division.
Since differentiation of a function J(t) (roughly) corresponds to multiplication of its
transform ::£(f) by s, we expect integration of J(t) to correspond to division of ::£(f) by s:
THEOREM 3
Laplace Transform of Integral
Let F(s) denote the transfonn of a function J(t) which is piecewise continuous for
t ~ 0 and satisfies a growth restriction (2), Sec. 6.1. Then, for s > 0, s > k, and
t> 0,
(4)
PROOF
thus
Denote the integral in (4) by get). Since J(t) is piecewise continuous, get) is continuous,
and (2), Sec. 6.1, gives
Ig(t)1
=
I
I
tt
~ (IJ(T)I dT ~ M ( e
t
(J(T) dT
10
10
kT
10
M
M
= _(ekt
dT
-
k
L):S _ekt
-
k
(k> 0).
This shows that get) also satisfies a growth restriction. Also, g' (t) = J(t), except at points
at which J(t) is discontinuous. Hence g' (t) is piecewise continuous on each finite interval
and, by Theorem 1, since g(O) = 0 (the integral from 0 to 0 is zero)
::£{f(t)} = ::£{g'(t)} = s::£{g(t)} - g(O)
= s::£{g(t)}.
Division by s and interchange of the left and right sides gives the first formula in (4),
•
from which the second follows by taking the inverse transform on both sides.
E X AMP L E 3
Application of Theorem 3: Formulas 19 and 20 in the Table of Sec. 6.9
I
Using Theorem 3. find the inverse of
2
.1(.1
I
2
+ w )
and
2
2
.I (.I
+
2
W )
Solution. From Table 6.1 in Sec. 6.1 and the integration in (4) (second formula with the sides interchanged)
we obtain
C
:£
_I{ __I_}
s
2
+ w2
=
sin wt
w
'
;£
-I{
2
.1(.1
I
2
+ w)
}
=
It
0
sin
WO'
--
w
dO' =
I
(l - cos wt).
w
2
230
CHAP. 6
Laplace Transforms
This is formula 19 in Sec. 6.9. lmegraring this result again and using (4) as before, we obtain formula 20 in
Sec. 6.9:
;;e-l{ 2 21+ 2} = ~ ft(1 - cos wr) dr = [-;. s (s
w )
W O W
JI
sin :r
sin wt
w
0
W
w3
2
It is typical that results such as these can be found in several ways. In this example. try partial fraction
reduction.
•
Differential Equations, Initial Value Problems
We shall now discuss how the Laplace transfonn method solves ODEs and initial value
problems. We consider an initial value problem
(5)
y" + ay'
+ by =
ret),
yeO)
= Ko,
y'(O)
= Kl
where a and b are constant. Here ret) is the given input (driving force) applied to the
mechanical or electrical system and yet) is the output (response to the input) to be obtained.
In Laplace's method we do three steps:
Step 1. Setting up the subsidiary equation. This is an algebraic equation for the transfonn
Y = .;£(y) obtained by transforming (5) by means of (J) and (2), namely,
[S2y - sy(O) - ),'(0)]
+
a[sY - yeO)]
+ bY
=
R(s)
where R(s) = ;£(r). Collecting the Y-tenns, we have the subsidiary equation
(S2
+
as
+
b)Y
= (s +
a)y(O)
+ y' (0) +
R(s).
Step 2. Solution of the subsidiary equation by algebra. We divide by
use the so-called transfer function
(6)
s2
+ as + band
1
Q(s)
=
s2
+ as + b
(Q is often denoted by H. but we need H much more frequently for other purposes.) This
gives the solution
(7)
Yes)
=
[(s
+
a)y(O)
+ y' (O)]Q(s) +
R(s)Q(s).
If yeO) = y' (0) = 0, this is simply Y = RQ; hence
Q=
Y
;£(output)
R
;£(input)
and this explains the name of Q. Note that Q depends neither on ret) nor on the initial
conditions (but only on a and b).
Step 3. III version ofY to obtain y = ;£-1(1'). We reduce (7) (usually by partialfractiolls
as in calculus) to a sum of tenns whose inverses can be found from the tables (e.g .• in
Sec. 6.1 or Sec. 6.9) or by a CAS, so that we obtain the solution yet) = ?l(y) of (5).
SEC. 6.2
231
Transforms of Derivatives and Integrals. ODEs
E X AMP L E 4
Initial Value Problem: The Basic Laplace Steps
Solve
Solution.
y' (0)
y(o) = 1,
y" - y = t,
I.
=
Step 1. Prom (2) and Table 6.1 we get the subsidiary equation [with Y = .P(yl]
thus
Step 2. The transfer function is Q = 1/(.1'2 - 1), and (7) becomes
Y = (.I'
+
+
I)Q
I
.1'2 Q
.I' + I
.1'2 _ 1
=
+
.1'2(.1'2 - 1)
Simplification and partial fraction expansion gives
1 (1
---
Y=--+
s-]
.1'2-1
~).
s
-
Step 3. From this expression for Y and Table 6.1 we obtain the solution
_,-1 _ _1{_I} '_I{_1 }
y(t) -:£
(Y) -
:£
+Y
.1'-1
2
.1'-1
-
c -1{~}
Y
2
.I'
_
-
.
et + smh
t _ t.
•
The diagram in Pig. liS summarizes our approach.
s-space
t-space
Subsidiary equation
Given problem
y" -y = t
(s2 - I)Y = s + 1 + 1Is2
~
y(O) =1
y'(O) =1
t
Solution of subsidiary equation
Solution of given problem
yet) = e' + sinh t -
-E-
t
Y= _1_ +_I__ ...l
s-1
Fig. 115.
E X AMP L E 5
s2-1
s2
Laplace transform method
Comparison with the Usual Method
Solve the initial value problem
y"
Solution.
+ y' + 9y
y(O) = 0.16,
= 0,
)"(0) =
o.
From (1) and (2) we see that the subsidiary equation is
s2y - 0.16.1'
+ sY - 0.16 + 9Y
=
0,
ts 2
thus
+
.I'
+ 9)Y =
0.16(.1'
+
I).
The solution is
0.16(.1'
+
1)
+ ~) + 0.08
+ ~)2 + ~
0.16(.1'
(.I'
Hence by the first shifting theorem and the formulas for cos and sin in Table 6.1 we obtain
yet)
= Y-\y) =
e- tI2 (0.16 cos
= e -0.5\0.16
08
f35 t + 1°.
V4
2 V35
cos 2.96t
sin
f35 t)
V4
+ 0.027 sin 2.96tl.
This agrees with Example 2, Case (TIl) in Sec. 2.4. The work was less.
•
CHAP. 6
232
Laplace Transforms
Advantages of the Laplace Method
1.
Solving a nonhomogeneous ODE does not require first solving the
homogeneous ODE. See Example 4.
2. Initial values are automatically taken care of See Examples 4 and 5.
3.
E X AMP L E 6
Complicated inputs ret) (right sides of linear ODEs) can be handled very
efficiently, as we show in the next sections.
Shifted Data Problems
This means initial value problems with initial conditions given at some 1 = 10 > 0 instead of t = O. For such
a problem set 1 = 7 + to, so that t = to gives 7 = 0 and the Laplace transform can be applied. For instance,
solve
y" + Y = 2t,
Solution.
We have to =
!7T
y" + Y =
and we set r = 7 + !7T. Then the problem is
2(1 + !7T),
vi
y'(O) = 2 -
yeO) = !7T,
where y(7) = y(t). Using (2) and Table 6.1 and denoting the transform of y by Y, we see that the subsidiary
equation of the "shifted·' initial value problem is
2~
2
17T
S Y - S·!7T - (2 - V 2) + Y = 2" + ~
S
Solving this algebraically for
Y,
(s
2
+
-
I)Y =
!7T
2
(s2
+ l)s2 +
(s2
2-
!7TS
+ l)s +
s2
=
.;e-l(y) = 2(7 - sin 7) +
=
1
-
Now 7 = t - 47T, sin 1 =
27 +
1
Yz
(sin t -
Using (l) or (2), find ;£(f) if fU) equals:
2.
3. sin2 wt
2
5. sinh at
7.
1
!7T(I -
sin ~'1Tr
1
7rt
6. cosh 2
8. sin4 t
cos 7) +
=
1), and the last two terms give cos
!7T cos 7 + (2 - Yz) sin 7
!t
(Use Prob. 3.)
9. (Derivation by different methods) It is typical that
various transforms can be obtained by several methods.
Show this for Prob. 1. Show it for ;£(cos 2 !t) (a) by
•
= 21 - sin t + cos t.
expressing cos2 !t in terms of cos t, (b) by using
Prob.3.
110-241
cos 51
4. cos 2
Yz
cos 1), so that the answer (the solution) is
OBTAINING TRANSFORMS BY
DIFFERENTIATION
1. te kt
~
2 - V 2.
!7T - Yz sin 7.
y
11-81
2"1 7TS +
+
S
+ 1 + ~.
The inverse of the first two terms can be seen from Example 3 (with w
and sin,
y
2
s
2" +
we obtam
_
Y =
thus
S
INITIAL VALUE PROBLEMS
Solve the following initial value problems by the Laplace
transform. (If necessary, use partial fraction expansion as
in Example 4. Show all details.)
+
y' +
10. y'
4y
= O.
11.
h
= 17
yeO) = 2.8
sin 2t,
12. y" - y' - 6y = 0,
y'(O) = 13
yeO)
= -}
yeO) = 6,
SEC. 6.3
Unit Step Function. t-Shifting
13. y" -
h
233
yeO) = 4,
= 0,
14. y" - 4y' + 4y = 0,
y' (0) = 3.9
yeO) = 2.1,
15. y" + 2y' + 2y
yeO) = 1,
f(t)
0,
=
own to illustrate the advantages of the present method
(to the extent we have seen them so far).
y' (0) = 0
I
:/f(a-O)
1
:.....--f(a + Ol
y'(O) = -3
16. y" + ky' - 2k 2 y = O.
y' (0) = 2k
17. y" + 7y' + 12y
y'(0) = -10
18. y"
=
+ 9y = lOe-t,
+ 3y' + 2.2Sv
19. y"
y'(O) = 31.S
21e
3t
o
yeO) = 3.S,
,
= 9t
3
+
64.
/(0)
yeo)
=
=
0
1,
y(O) = 3.2.
(a)
.;£(t cos wt) =
+ 2y' + Sy = SOt - ISO,
y'(3) = 14
+ 0)
(s
(e)
.;£(t cosh at) =
(S2 _
2 2
w)
a 2 )2
2as
(f)
OBTAINING TRANSFORMS BY
INTEGRATION
127-341
Using Theorem 3, find f(t) if .;£(f) equals:
1
10
28.
27.
S3 - "'S2
S2 + s/2
29.
- f(a - O)]e-a.s.
(c) Verify (1 *) for f(t) = e- t if 0 < t < 1 and 0 if
t> 1.
(d) Verify (l *) for two more complicated functions of
your choice.
(e) Compare the Laplace transform of solving ODEs
with the method in Chap. 2. Give examples of your
w2
+
y(3) = -4,
25. PROJECT. Comments on Sec. 6.2. (a) Give reasons
why Theorems 1 and 2 are more important than
Theorem 3.
(b) Extend Theorem 1 by showing that if f(t) is
continuous, except for an ordinary discontinuity (finite
jump) at some t = a (> 0), the other conditions
remaining as in Theorem 1, then (see Fig. 116)
(1*) .;£(f') = s.;£(f) - f(O) - [f(a
S2 2
and from this and Example 1: (b) formula 21, (c) 22,
(d) 23 in Sec. 6.9,
S
24. y"
6.3
Formula (1*)
26. PROJECT. Further Results by DifferentiatiolL
Proceeding as in Example 1, obtain
y(2) = 4
21. (Shifted data) y' - 6y = 0,
22. y" - 2y' - 3y = 0,
yO) = -3,
y'(l) = -17
y(l) = 4,
23. y" + 3y' - 4y = 6e 2t - 2 •
=
a
Fig. 116.
y(O) = 0,
20. y" - 6y' + Sy = 29 cos 2t.
y' (0) = 6.2
y'(l)
i~
yeO) = 2.
31.
33.
S3 -
ks 2
S
S3 -
Ss
S4 -
4s 2
30.
32.
34.
S4
+
S3
+ 9s
S4
+
S2
2
7f2s 2
35. (Partial fractions) Solve Probs. 27, 29, and 31 by
using partial fractions.
Unit Step Function. f-Shifting
This section and the next One are extremely important because we shall now reach the point
where the Laplace transform method shows its real power in applications and its superiority
over the classical approach of Chap. 2. The reason is that we shall introduce two auxiliary
functions, the unit step function or Heaviside function u(t - a) (below) and Dirac's delta
(jet - a) (in Sec. 6.4). These functions are suitable for solving ODEs with complicated
right sides of considerable engineering interest, such as single waves, inputs (driving forces)
that are discontinuous or act for some time only, periodic inputs more general than just
cosine and sine, or impUlsive forces acting for an instant (hammerblows, for example).
234
CHAP. 6
Laplace Transforms
Unit Step Function (Heaviside Function)
u(t - a)
The unit step function or Heaviside function u(t - a) is 0 for t < a, has a jump of size
I at t = a (where we can leave it undefined), and is I for t > a, in a formula:
(1)
u(t - a) =
if t < a
{~
(a
~
0).
if t > a
Figure 117 shows the special case u(t), which has its jump at zero, and Fig. 118 the general
case u(t - a) for an arbitrary positive a. (For Heaviside see Sec. 6.1.)
The transform of u(t - a) follows directly from the defining integral in Sec. 6.1,
~{u(t -
a)}
=
lXo
e - stU(l
-
=
JX e - st • 1 dt =
st
_ e-
a
here the integration begins at t = a
(~
~{u(t
(2)
a) dt
I"" ;
S
t=a
0) because u(t - a) is 0 for 1 < a. Hence
- a)}
(S
>
0).
S
The unit step function is a typical "engineering function" made to measure for
engineering applications. which often involve functions (mechanical or electrical
driving forces) that are either "off' or "on." Multiplying functions f(t) with U(l - a).
we can produce all sorts of effects. The simple basic idea is illustrated in Figs. 119
and 120. In Fig. 119 the given function is shown in (A). In (B) it is switched off
between t = 0 and t = 2 (because u(t - 2) = 0 when t < 2) and is switched on
beginning at t = 2. In (C) it is shifted to the righl by 2 units, say, for instance, by 2 secs,
so that it begins 2 secs later in the same fashion as before. More generally we have the
following.
Let f(t) = 0 for all negative t. Then f(t - a)u(t - a) with a
(translated) to the right by the amount a.
> 0
is f(t) shifted
Figure 120 shows the effect of many unit step functions, three of them in (A) and
infinitely many in (B) when continued periodically to the right: this is the effect of a
rectifier that clips off the negative half-waves of a sinuosidal voltage. CAUTION! Make
sure that you fully understand these figures, in particular the difference between parts (B)
and (C) of Figure 119. Figure 119(C) will be applied next.
o
Fig. 117.
o
t
Unit step function u(tJ
Fig. 118.
a
Unit step function u(t - oj
SEC. 6.3
235
Unit Step Function. t-Shifting
5'L 5f\
-5 V -5 V
(tt)
I
o
(B)
(A) ((t) = 5 sin t
Fig. 119.
I
2 11: 211:
{(t)u(t -
t
2)
0
(el
2 11:-1-2211:+2
{(t - 2)u(t -
2)
Effects of the unit step function: (A) Given function.
(B) Switching off and on. (e) Shift.
k~'-------'4
I
I
,
,
1.
6 ---t
~
-k
02468lO
(B) 4 sin (~11:t)[u(t) - u(t - 2) + u(t - 4) - + ... ]
(A) k[u(t - 1) - 2u(t - 4) + u(t - 6)]
Fig. 120.
Use of many unit step functions.
Time Shifting (t-Shifting): Replacing t by t - a in f{t)
The first shifting theorem ("s-shifting") in Sec. 6.1 concerned transforms pes) = ~{f(t)}
and F(s - a) = ~{eatJ(t)}. The second shifting theorem will concern functions J(t) and
J(t - a). Unit step functions are just tools, and the theorem will be needed to apply them
in connection with any other functions.
THEOREM 1
Second Shifting Theorem; Time Shifting
If J(t)
has the
tran~fonl!
let)
(3)
F(s), then the "shifted function"
if t
= J(t - a)u(t - a) = {
J(t - a)
has the
(4)
Or,
(4*)
tran~fonn
e-aSP(s). That is,
~{f(t
< a
0
if t > a
if ~{f(t)} = pes), then
- a)uCt - a)}
= e-asF(s).
!f we take the inverse on both sides, we can write
J(t - a)u(t - a) =
~-l{e-a.sp(s)}.
Practically speaking, if we know pes), we can obtain the transform of (3) by multiplying
pes) bye-as. In Fig. 119, the transform of 5 sin tis F(s) = 5/(S2 + 1), hence the shifted
function 5 sin (t - 2) u(t - 2) shown in Fig. 119(C) has the transform
CHAP. 6
236
PROOF
Laplace Transforms
We prove Theorem 1. In (4) on the right we use the definition of the Laplace transform,
writing 'T for t (to have t available later). Then, taking e- as inside the integral, we have
x
:x
e-asp(s)
Lo
e- as
=
e- S7J('T) d'T
=
L
e-scT+a>f('T) d'T.
0
Substituting 'T + a = t, thus 'T = t - a, d'T = dt, in the integral ( 'AUTION, the lower limit
changes!), we obtain
J
00
e-asp(s) =
e-stf(t - a) dt.
a
To make the right side into a Laplace transform, we must have an integral from 0 to 00,
not from a to IX. But this is easy. We multiply the integrand by u(l - a). Then for t from
o to a the integrand is 0, and we can write, with f as in (3),
(Do you now see why u(t - a) appears?) This integral i., the left side of (4), the Laplace
transform of f(t) in (3). This completes the proof.
•
E X AMP L E 1
Application of Theorem 1. Use of Unit Step Functions
Write the following function using unit step functions and find its rransform.
iro <
if I
if
Solulioll.
I
< I
< I <!1'
I> !1'.
(Fig. 121)
Step 1. In terms of unit step functions,
f(t)
= 2(1 -
u(t -
I))
+ !t2(1l(1
-
!7T»
I) - lI(t -
+ (cos I)U(I -
!1').
Indeed. 2(1 - 11(1 - I)) gives I(t) for 0 < I < 1, and so on.
I(tl in the form 1(1 - a)u(1 - a). Thus, 2(\ - 1I(t - I»
remains as it is and gives the transform 2(1 - e -')1.1'. Then
Step 2. To apply Theorem \, we must write each term in
{12
;e
21 11(1 -
} (I
I)
~ 9., "2(t - I)
2+
(t -
1)
+
I)} (17i + --;;:1 + 1)
2
lI(t -
1)
~
2s
e-
S
Together,
I
"3"
( s
I + -I
+ ""2
s
2.1'
) e- s -
( -I
.1'3
+ - 7T + -1'2 ) e- ws/2
2s2
8s
-
I _ e- ws12
__
s2 + I
.
SEC. 6.3
237
Unit Step Function. t-Shifting
If the conversion of f( t) to f( t - a) is inconvenient. replace it by
~{f(t)u(t
(4**)
- a)}
=
e-as~{f(t
(4**) follows from (4) by writing f(t - a) = get). hence f(t) = get
+
a)}.
+ a) and then again writing f
for g. Thus.
as before. Similarly for .'£{~t2u(t - ~'17)}. Finally, by (4**).
fIt)
2
Or--L~--~--~r----L--~~--~---T----L---~--~
1T
21T
41T
-1
Fig. 121. t(t} in Example 1
E X AMP L E 2
Application of Both Shifting Theorems. Inverse Transform
Find the inverse transform f(t) of
Solution. Without the exponential functions in the numerator the three terms of F(s) would have the inverses
(sin '17t)/'17, (sin '17t)/'17, and te -21; because IIs 2 has the inverse t, so that 1/(s + 2)2 has the inverse te -2t by the
first shifting theorem in Sec. 6.1. Hence by the second shifting theorem (t-shifting),
1
f(t) = - sin ('17(t - 1» u(t - 1)
'17
1
+ -
'17
sin ('17(t - 2» u(t - 2)
+ (t
- 3)e -2(t-3) u(t - 3).
Now sin ('17t - '17) = -sin '17t and sin ('17t - 2'17) = sin '17t, so that the second and third terms cancel each other
when t > 2. Hence we obtain f(t) = 0 if 0 < t < 1, -(sin '17t)/'17ifl < t < 2,0 if2 < t < 3, and (t - 3)e -2(t-3)
if t > 3. See Fig. 122.
•
0.3
0.2
0.1
OL------L______L-____~______~____~~~==~___
o
2
3
4
5
6
Fig. 122. t(t} in Example 2
E X AMP L E 3
Response of an RC-Circuit to a Single Rectangular Wave
Find the current i(t) in the RC-circuit in Fig. 123 if a single rectangular wave with voltage Vo is applied. The
circuit is assumed to be quiescent before the wave is applied.
238
CHAP. 6
Laplace Transforms
c
vet)
v(t)
a
R
Fig. 123.
RC-circuit, electromotive force v(t), and current in Example 3
Solutioll.
The input is Vo[lI(t equation (see Sec. 2.9 and Fig. 123)
+
Ri(t)
a) -
= Ri(t) + ~
q(t)
C
b)]. Hence the circuit is modeled by the integro-differential
11([ -
C
f
t
i(T) dT
0
= vet) = Vo [lI(t -
a) -
tI(t - b)l.
Using Theorem 3 in Sec. 6.2 and formula (I) in this section, we obtain the subsidiary equation
RI(s)
+ Irs)
=
Vo [e -as _ e -bsj.
sC
s
Solving this equation algebmically for I(s). we get
where
VoIR
F(s) = - - - " - - s + J/(RC)
and
the last expression being obtained fi'om Table 6.1 in Sec. b.l. Hence Theorem 1 yields the solution (Fig. 123)
that is. i(l) = 0 if t < a. and
ifa<l<b
if a > b
•
E X AMP L E 4
Response of an RLC-Circuit to a Sinusoidal Input Acting Over a Time Interval
Find the response (the current) of the RLC-circuit in Fig. 124, where E(t) is sinusoidal. acting for a short time
interval only. say.
E(t)
= 100 sin 400t if 0 < t < 27T
and
E(t) = 0 if t
>
27T
and current and charge are initially zero.
Solution.
The electromotive force E(t) can be represented by (100 sin 400t){l - u(t - 27T)). Hence the
model for the current i(t) in the circuit is the integro-differential equation (see Sec. 2.9)
O.li'
+
I Ii
+
100
f
t
i(T) dT
= (100 sin 400t)(1
- tI(t - 27T»,
;(0)
= O.
o
From Theorems 2 and 3 in Sec. 6.2 we obtain the subsidiary equation for Irs) = 5£(i)
/
O.ls/ + 111 + 100-
s
2
100' 400s (-sl _ e2
s2 + 400
s
r.S).
/(0)
= O.
SEC. 6.3
239
Unit Step Function. t-Shifting
Solving it algebraically and noting that .1 2 + 110.1 + 1000 = (s + 10)(.1 + 100), we obtain
I(s) =
(.I
2
se- .,,-s
s2 + 4002
(s
1000·400
+ 10)(.1 + 100)
.1 2 + 4002 -
)
.
For the first term in the parentheses ( ... ) times the factor in front of them we use the partial fraction expansion
400000.1
A
(s + 10)(.1 + 100)(s2 + 4002)
.1+10
+
B
+
s+100
Ds
+K
~-----=
s2+4002
Now determine A, B, D, K by your favorite method or by a CAS or as follows. Multiplication by the common
denominator gives
400000.1 = A(s + 100)(.1 2 + 4002) + B(s + 1O)(s2 + 400 2) + (Ds + K)(s + 10)(.1 + 100).
We sets = -10 and -100 and then equate the sums of the s3 and .1 2 terms to zero, obtaining (all
values rounded)
-4000000 = 90(102 + 4002)A,
(s = -10)
(s
Since K
=
A = -0.27760
-40000000 = -90(1002 + 4002)B,
-100)
B = 2.6144
+ D,
(s3- terms )
0= A + B
(s2-terms)
0= 100A + lOB + llOD + K,
D = -2.3368
= 258.66 = 0.6467' 400, we thus obtain for the first term I1 in I =
2.6144
2.3368.1
s + 100
.1 2 + 4002
0.2776
h = - ----- + -----.I
+ 10
K
=
258.66.
II - 12
0.6467 . 400
+ ---,;;--------;:2
s2 + 400
From Table 6.1 in Sec. 6.1 we see that its inverse is
i1 U) = -0.2776e- lOt + 2.6144e- lOOt - 2.3368 cos 400t + 0.6467 sin 400/.
This is the cunent i(t) when 0 < t < 27T. It agrees for 0 < r < 27T with that in Example 1 of Sec. 2.9 (except
for notation), which concerned the same RLC-circuit. Its graph in Fig. 62 in Sec. 2.9 shows that the exponential
terms decrease very rapidly. Note that the present amount of work wa, substantially less.
The second term h of 1 differs from the first term by the factor e -2.,,-s. Since cos 400(1 - 27T) = cos 400t
and sin 400(1 - 27T) = sin 400t, the second shifting theorem (Theorem I) gives the inverse i2(t) = 0 if
o < t < 27T. and for > 27T it gives
i2(t)
= -0.2776e- lOCt - 2.,,-)
+ 2.6144e- lOO \t-2r.) - 2.3368 cos 400t + 0.6467 sin 400t.
Hence in i(t) the cosine and sine terms cancel, and the current for t > 27T is
i(t) = -0.2776(e- lOt - e- lOCt - 2.,,-)
+ 2.6144(e- lOOt
It goes to zero very rapidly, practically within 0.5 sec.
E(t)
Fig. 124.
RLC-circuit in Example 4
_ e-lOOCt-2.,,-).
•
240
CHAP. 6
Laplace Transforms
1. WRITING PROJECT. Shifting Theorem. Explain
and compare the different roles of the two shifting
theorems, using your own formulations and examples.
\2-13\
UNIT STEP FUNCTION AND SECOND
SHIFTING THEOREM
Sketch or graph the given function (which is assumed to
be zero outside the given interval). Represent it using unit
step functions. Find its transform. Show the details of your
work.
2.
4.
6.
8.
10.
12.
< t < 1)
3. e t (0 < t < 2)
5. t 2 (I < t < 2)
sin 3t (0 < t < 'IT)
7. cos 'lTf (1 < t < 4)
t 2 It > 3)
t
I - e- (0 < t < 'IT) 9. t (5 < t < 10)
11. 20 cos 7ft (3 < f < 6)
sin wt (t > 6 'IT/ w)
13. e'Ut (2 < t < 4)
sinh t (0 < t < 2)
t (0
\14-22\
INVERSE TRANSFORMS BY THE
SECOND SHIFTING THEOREM
30. y" - 16)' = r(t),
o if t > 4:
31.
ret) = 48e
y(O)
21
if 0
= 3.
<
t
< 4 and
y'CO) = -4
y" + y' - 2)" = 1'(1), r(t) = 3 sin t - cos t if
o < t < 2'IT and 3 sin 2t - cos 2t if t > 2 'IT;
yeO) = 1,
y'(O) = 0
8y' + 15)' = ret), r(t) = 35e 21 if
t < 2 and 0 if t > 2;
yeO) = 3.
y'(O) = -8
32. y"
+
o<
33. (Shifted data) v" + 4v = 8t 2 if 0 < t < 5 and 0
if t > 5; y{ 1) ~ I + 'cos 2, y' (1) = 4 - 2 sin 2
+ 2)"' + 5)' = 10 sin t if 0 < t < 2'IT and 0 if
t > 2'IT; Y('IT) = I, y'('IT) = 2e-r. - 2
34. y"
MODELS OF ELECTRIC CIRCUITS
35. (Discharge) Using the Laplace transform, find the
charge q(t) on the capacitor of capacitance C in Fig. 125
if the capacitor is charged so that its potential is Vo and
the switch is closed at f = O.
Find and sketch or graph f(t) if ;e(n equals:
14. se- s /(s2
4s
15. e- /s
16. S-2 -
w 2)
+
R
2
+
(s-2
17. (e- 27TS -
s-l)e- S
e- Br.s)/(s2
+
Fig. 125.
I)
+ 2s + 2) 19. e- 2s /s 5
e-s+k)/(s - k) 21. se- 3s /(s2 -
18. e- 7Ts /{s2
20. (1 -
22. 2.5(e- 3 . Bs
\23-34\
4)
e- 2 . 6S )/s
-
INITIAL VALUE PROBLEMS, SOME WITH
DISCONTINUOUS INPUTS
Using the Laplace transform and showing the details, solve:
23. y"
+ 2y' + 2v
y' (0)
=
=
O.
0,
yeO) =
+
Y = 0,
\36-38\
RC-CIRCUIT
Using the Laplace transform and showing the details. rmd
the current i(f) in the circuit in Fig. 126 with R = 10 fl and
C = 10- 2 F, where the current at t = 0 is assumed to be
zero. and:
36.
100 V if 0.5 < f
Why does i(t) have jumps?
v(t) =
< 0.6 and 0 otherwise.
37. v = 0 if t < 2 and 100 (t - 2) V if t > 2
38. v = 0 if t < 4 and 14' 106 e -3t V if t > 4
1
24. 9)''' - 6)"'
Problem 35
yeO) = 3,
y'(O) = 1
25. y" + 4y'
+
13y = 145 cos 2t,
yeO) = 10,
C
y'(O) = 14
yeO) = ~,
26. y" + lOy' + 24y = 144t 2 ,
y'(0) = -5
+ 9y =
if t > 'IT;
27. y"
r(r), 1'(1) = 8 sin t if 0
yeO) = 0, y'(O) = 4
<
t
<
'IT and 0
28. y" + 3;-' + 2y = r(t), r(t) = 1 if 0 < t < 1 and
o if t > 1; yeO) = 0, y' (0) = 0
29. y"
+
t> 1;
y
r(t),
yeO)
r(t) = t if 0
=
y'(O)
=
0
<
t
< 1 and 0 if
L
Fig. 126.
/39-411
R
vet)
Problems 36-38
RL-CIRCUIT
Using the Laplace transform and showing the details, find
the current i(t) in the circuit in Fig. 127, assuming i(O) = 0
and:
SEC. 6.4
241
Short Impulses. Dirac's Delta Function. Partial Fractions
39. R = 10 fl, L = 0.5 H, v = 200t V if 0 < t < 2 and
Oift>2
C
40. R = 1 kfl (= 1000 fl), L = 1 H, v = 0 if
o < t < r., and 40 sin t V if t > 'iT'
41. R = 25 fl, L = 0.1 H, v = 490e- 5t V if
o < t < 1 and 0 if t > 1
L
Fig. 128.
L
u(t)
Problems 42-44
RLC-CIRCUIT
R
L
u(t)
Fig. 127.
142-441
Problems 39-41
LC-CIRCUIT
Usmg the Laplace transform and showing the details, find
the current i(t) in the circuit in Fig. 128, assuming zero
initial current and charge on the capacitor and:
42. L
=
o<
1 H, C = 0.25 F, v
t < 1 and 0 if t > 1
=
200(t -
tt
3
)
Using the Laplace transform and showing the details, find
the current i(t) in the circuit in Fig. 129. assuming zero
initial current and charge and:
45. R = 2 n, L = I H. C = 0.5 F. vet) = 1 kV if
o < t < 2 and 0 if t > 2
46. R = 4 n, L = I H, C = 0.05 F, v = 34e- t V
if 0 < t < 4 and 0 if t > 4
47. R = 2 n, L = I H, C = 0.1 F, v = 255 sin t V
if 0 < t < 2r. and 0 if t > 2r.
V if
43. L = I H, C = 10- 2 F, v = -9900 cos t V if
'iT' < t < 3 r. and 0 otherwise
44. L = 0.5 H, C = 0.05 F, v = 78 sin t V if
o < t < r. and 0 if t > r.
6.4
u(t)
Fig. 129.
Problems 45-47
Short Impulses. Dirac's Delta Function.
Partial Fractions
Phenomena of an impulsive nature. such as the action of forces or voltages over short
intervals of time, arise in various applications, for instance, if a mechanical system is hit
by a hammerblow, an airplane makes a "hard" landing, a ship is hit by a single high wave,
or we hit a tennisball by a racket, and so on. Our goal is to show how such problems are
modeled by "Dirac's delta function" and can be solved very efficiently by the Laplace
transform.
To model situations of that type, we consider the function
(1)
Ilk
h(t - a) = { 0
ifa~t~a+k
(Fig. 130)
otherwise
(and later its limit as k --'.> 0). This function represents. for instance. a force of magnitude
11k acting from t = a to t = a + k, where k is positive and small. In mechanics, the
integral of a force acting over a time intervalll ~ t ~ a + k is called the impulse of the
242
CHAP.6
Laplace Transforms
force; similarly for electromotive force!. E(t) acting on circuits. Since the blue rectangle
in Fig. 130 has area 1, the impulse of fk in (l) is
a+k I
a) dt =
dt = 1.
oak
J - -:
:x:
h
(2)
=
L fk(t -
To find out what will happen if k becomes smaller and smaller, we take the limit of fk
as k ~ 0 (k > 0). This limit is denoted by 8(t - a), that is,
8(t - a)
= k-O
lim h(t -
a).
8(t - a) is called the Dirac delta function 2 or the unit impulse function.
8(t - a) is not a function in the ordinary sense as used in calculus, but a so-called
generalizedjullction. 2 To see this, we note that the impulse lk of fk is I, so that from (1)
and (2) by taking the limit as k ~ 0 we obtain
if t
(3)
8(t -
a)
= {:
=
a
100 8(t -
and
a) dt = 1,
o
otherwise
but from calculus we know that a function which is everywhere 0 except at a single point
must have the integral equal to O. Nevertheless, in impulse problems it is convenient to
operate on 8(t - a) as though it were an ordinary function. In particular, for a continuous
function get) one uses the property [often called the sifting property of B(t - a), not to
be confused with shifting 1
IXo g(t) B(t -
(4)
a) dt
=
g(a)
which is plausible by (2).
To obtain the Laplace transform of 8(t - a), we write
fk(t - a)
I
= -
k
[u(t - a) - u(t - (a
+
k»]
r~Area=l
11k
t
Fig. 130.
a a+k
t
The function fk(t - 0) in (1)
2 PAUL DIRAC (1902-1984), English physicist, was awarded the Nobel Plize [jointly with the Austrian
ERWIN SCHRODINGER (1887-1961)] in 1933 for his work in quantum mechanics.
GeneralIzed functions are also called distributions. Their theory was created in 1936 by the Russian
mathematician SERGEI L'VOVICH SOBOLEV (1908-1989). and in 1945. under wider aspects, by the French
mathematician LAURENT SCHWARTZ (1915-2002).
SEC. 6.4
243
Short Impulses. Dirac's Delta Function. Partial Fractions
and take the transform [see (2)]
1
_ [e- as
ks
:£{h(t - a)}
e-Ca+k)s]
_
=
e- as
1 - e- ks
----
ks
We now take the limit as k----7 O. By l'H6pital's rule the quotient on the right has the limit
1 (differentiate the numerator and the denominator separately with respect to k, obtaining
se- ks and s, respectively, and use se-ks/s ----7 1 as k ----7 0). Hence the right side has the
limit e- as • This suggests defining the transform of 8(t - a) by this limit, that is,
:£(8(t - a)} = e- as •
(5)
The unit step and unit impulse functions can now be used on the right side of ODEs
modeling mechanical or electrical systems, as we illustrate next.
E X AMP L E 1
Mass-Spring System Under a Square Wave
Determine the response of the damped mass-spring system (see Sec. 2.8) under a square wave, modeled by (see
Fig. 131)
y" + 3y' + 2y = r(t) = u(r - 1) - u(t - 2).
/(0) = o.
.1'(0) = O.
Solution.
From (1) and (2) in Sec. 6.2 and (2) and l4) in this section we obtain the subsidiary equation
s2y -t 3sY + 2Y
=
-
1
s
e- 2S ).
(e- s
1
Yes) =
Solution
2
s(s
+ 3s + 2)
(e- s - e- 2S ).
Using the notation F(,,) and partial fractions, we obtain
F(s) =
1
s(s2
1
+ 3s +
2)
= ----s(s
+
1)(s
+
2)
112
s
--+
s+1
112
s+2
From Table 6.1 in Sec. 6.1. we see that the inverse is
Therefore, by Theorem 1 in Sec. 6.3 (t-shifting) we obtain the square-wave response shown in Fig. 131,
y = :g-\F(s)e- s - F(s)e- 2s )
= f(t -
I)u(t - I) - f(t - 2)u(t - 2)
o
=
~
_ e-Ct-1)
+ ~e-2Ct-l)
{ -e -Ct-l) + e -Ct-2) + "2e
1 -2(t-1)
-
1
"2e
-2(t-2)
yet)
0.5
2
3
4
Fig. 131. Square wave and response in Example 1
(0
< t<
J)
(1
< t<
2)
(t> 2). •
244
E X AMP L E 2
CHAP.6
laplace Transforms
Hammerblow Response of a Mass-Spring System
Find the response of the system in Example I with the square wave replaced by a unit impulse at time
t = L
Solutioll.
We now have the ODE and the subsidiary equation
y" + 3/ + 2y
I).
= S(t -
and
(S2
+
3s
+
2)Y = e- s .
Solving algebraically gives
( 1
e -s
Yes) =
(s
+
+
1)(5
I) e .
~-s+2
2)
-s
By Theorem 1 the inverse is
y(t) = ~C\Y) =
ifO<t<1
0
{
e
-(t-l)
- e
-2(t-ll
if
t> 1.
y(t) is shown in Fig. 132. Can you imagine how Fig. 131 approaches Fig. 132 as the wave becomes shorter and
shorter. the area of the rectangle remaining I?
•
yet)
0.2
0.1
,3
Fig. 132.
E X AMP L E 3
5
Response to a hammerblow in Example 2
Four-Terminal RLe-Network
n.
Find the output voltage response in Fig. 133 if R = 20
L = I H, C = 10- 4 F, the input is S(t) (a umt impulse
at time t = 0). and current and charge are zero at time t = O.
Solutioll.
To understand what is going on, note that the network is an RLC-cucuit to which two wires at A
and B are attached for recording the voltage v(r) on the capacitor. Recalling from Sec. 2.9 that current i(t) and
charge q(t) are related by i = q' = dqldt, we obtain the model
Li'
+
Ri
+ !!..
C
= Lq"
+
Rq'
+ q
C
=
q"
+ 20q' + 10000q
= Set).
From (1) and (2) in Sec. 6.2 and (5) in this section we obtain the subsidiary equation for Q(s) = '£(q)
(S2
+ 20s + 10OOO)Q
= 1.
Solution
Q=------::--(s + 10)2 + 9900
By the first shifting theorem in Sec. 6.1 we obtain from Q damped oscillations for q and v; rounding
9900 = 99.502 , we get (Fig. 133)
I
lOt
q = ~C\Q) = 99.50 esin 99.50t
and
v =
q = lOO.5e- 1ot sin 99.50t.
C
•
SEC. 6.4
245
Short Impulses. Dirac's Delta Function. Partial Fractions
8(t)
v
80
R
/\
40
L
C
0
B
A
V
-40
v(t);
-80
?
f\
O.O~
0-
\
dlj O.l!'L
Network
""
0.2
0.3
:25
Voltage on the capacitor
Fig. 133.
Network and output voltage in Example 3
More on Partial Fractions
We have seen that the solution Y of a subsidiary equation usually appears as a quotient
of polynomials Y(s) = F(s)JG(s), so that a partial fraction representation leads to a
sum of expressions whose inverses we can obtain from a table, aided by the first
shifting theorem (Sec. 6.1). These representations are sometimes called Heaviside
expansions.
An un repeated factor s - a in G(s) requires a single partial fraction AJ(s - a). See
Examples I and 2 on pp. 243, 244. Repeated real factors (s - a)2, (s - a)3, etc., require
partial fractions
etc.,
The inverses are (A2t + AI)e at , (iA 3t 2 + A2t + Al)e at , etc.
Unrepeated complex factors (s - a)(s - a), a = 0: + i{3, a = 0: - i{3, require a partial
fraction (As + B)J[(s - 0:)2 + 132]. For an application, see Example 4 in Sec. 6.3.
A further one is the following.
E X AMP L E 4
Unrepeated Complex Factors. Damped Forced Vibrations
Solve the initial value problem for a damped mass-spring system acted upon by a sinusoidal force for some
time interval (Fig. 134),
y" + 2y' + 2y = rlt),
r(t) = 10 sin 2t if 0 < t <
7T
>
and 0 if t
7T;
y(O) = I,
y' (0)
Solution.
=
~5.
From Table 6.1, (1), (2) in Sec. 6.2, and the second shifting theorem in Sec. 6.3, we obtain the
subsidiary equation
(s2y ~
S
+ 5) + 2(sY ~ I) + 2Y = 10
2
~2-- (1 ~ e- 7TS ).
s
+4
We collect the Y-terms, (s2 + 2s + 2)Y, take ~s + 5 - 2 = ~s + 3 to the right, and solve,
(6)
20
y = ----;;----;:---(s2 + 4)(s2 + 2s + 2)
s
(S2 + 4)(s2
+ 2s +
For the last fraction we get from Table 6.1 and the first Shifting theorem
(7)
';£-
I{
S+I~4}
2
+ I) + I
(s
=
t
e- (cost
~
~
3
2) + s2 + 2s + 2
4sint).
246
CHAP. 6
Laplace Transforms
In the first fraction in (6) we have unrepeated complex roots, hence a partial fraction representation
20
As
+
Ms+N
B
(S2 + 4)(s2 + 2s + 2)
Multiplication by the common denominator gives
20
=
(As + 8)(s2 + 2s + 2) + (Ms + M(s2 + 4).
We determine A, B, M, N. Equating the coefficients of each power of s on both sides gives the four equations
= A
+M
0 = 2A
+
(a)
[s3]: 0
(c)
[sl:
2B
+ 4M
(b)
[s2]:
0 = 2A
(d)
[sol:
20 = 2B
+
B
+N
+ 4N.
We can solve this, for instance, obtaining M = -A from (a), then A = B from (c), then N = - 3A from (b),
and finally A = -2 from (d). Heuce A = -2, B = -2, M = 2. N = 6. and the fust fraction in (6) has the
representation
-2s - 2
(8)
s2
+4 +
2(s
+ I) + 6 - 2
+ 1)2 + I
-2 cos 2f - sin 2f + e- t (2 cos f + 4 sin f).
Inverse transform:
(s
The ,urn of this and (7) is the ,olution of the problem for 0 <
f
< 1'. namely (the sines cancel).
y(t) = 3e- t cos f - 2 cos 2f - sin 21
(9)
if 0<1<1'.
In the second fraction in (6) taken with the minus sign we have the factor e--rrs, so that from (8) and the second
shifting theorem (Sec. 6.3) we get the inverse transform
+2 cos (2f - 21') + sin (2, - 217) - e-(t--rr) [2 cos (I - 17) + 4 sin (I - 1T)]
= 2 cos 2f + sin 21 + e -(t-.,,-) (2 cos, + 4 sin f).
The sum of this and (9) is the solution for
y(l)
(iO)
=
f
>
17,
e- t [(3
+ 2e"') cos f + 4e 7T sin I]
if'> 17.
Figure 134 shows (9) (for 0 < , < 17) and (10) (for' > 17). a beginning vibration, which goes to zero rapidly
•
because of the damping and the absence of a driving force after r = 1'.
yet)
2
-----l
y =
0 (Equilibrium
position)
y
2rr
3rr
-1
Driving force l----II'=~
Dashpot (damping)
-2
Mechanical system
Output (solution)
Fig. 134.
Example 4
The case of repeated complex factors [(s - a)(s - a)]2, which is important in connection
with resonance, will be handled by "convolution" in the next section.
SEC. 6.4
247
Short Impulses. Dirac's Delta Function. Partial Fractions
PROBLEM SH 6:;:4
11-121
Showing the details. find. graph. and discuss the
+ Y = OCt y' (0) = 0
v" + 2v' + 2v
1. y"
2.
yeO)
217),
=
=
~olution.
10.
e- t + Sfj(t - 2).
'y' (0) = I
:,,(0) = '0.
!) -
.v" -
3.
for the differential equation. involving k, take specific
k's from the beginning.
EFFECT OF DELTA FUNCTION ON
VIBRATING SYSTEMS
y = 100(t 1000U yeO) = 10.
y' (0) = 1
1).
+ 3/ + 2)' = 10(sin t + oCt - I».
= 1.
y' (0) = - I
5. y" + 4y' + Sy = [I - ult - lO)]et - elOo(t yeO) = 0,
y' (0) = I
6. -,," + 2\"' - 3y = 1000(t - 2) + 1000(t 4. y"
yeO)
I.
yeO) =
7.
y"
.
+
= I.
8. y" + S/ + 6)'
= I
y (0)
yeo) = 0,
= o(t - !17)
-,,'(0) = 0
+
9. y" + 2/ + Sy = 2St - 1000(1
yeo) = -2,
y' (0) = S
u(t -
17)
3,"
12. v"
+
y
cos t,
Y' (0)'=
-2 sin t
+
100(t - 17),
•
(.I' - a)F(s)
hm -----'--'-
s~a
G(s)
F(s)
G(s)
=
-2.
- 41' = 2e t - 8e 2 0(t - 2),
~'(0)='2 . . /(0)=0
=
=
(b) Similarly, show that for a root a of order m and
fractions In
17),
10. y" + S)' = 2St - 1000(t - 17).
yeo)
y' (0) = S. (Compare with Prob. 9.)
+
A
lOy
= 10[1 - lilt - 4)] - 100(t - S).
.
I
y(O)
11. v"
3L
15. PROJECT. Heaviside Formulas. (a) Show that for a
simple root a and fraction AI(s - a) in F(s)/G(s) we
have the Heal'iside formllia
y'(O) = 0
+
2\"'
-
10),
(b) Experiment on the response of the ODE in
Example 1 (or of another ODE of your choice) to an
impulse OCt - a) for various systematically chosen a
(> 0); choose initial conditions yeO) =1= 0, y' (0) = O.
Also consider the solution if no impulse is applied. Is
there a dependence of the response on a? On b if you
choose bo(t - a)? Would -o(t - Ii) with a > a
annihilate the effect of o(t - a)? Can you think of
other
questions
that
one
could
consider
experimentally by inspecting graphs?
Al
s-a
+ - - + further fractions
we have the Heaviside formulas for the first coefficient
yeO)
=
0,
.
(.I' - a)mF(s)
Am = hm - - - - -
I
G(s)
s-.a
13. CAS PROJECT. Effect of Damping. Consider a
vibrating system of your choice modeled by
y" + cy' + ky =
I'lt)
with r(t) involving a B-function. (a) Using graphs of
the solution. describe the etJect of continuously
decreasing the damping to O. keeping k constant.
(b) What happens if c is kept constant and k is
continuously increased, starting from O?
(c) Extend your results to a system with two
o-functions on the right. acting at different times.
14. CAS PROJECT. Limit of a Rectangular Wave.
Effects of ImpUlse.
(a) In Example I. take a rectangular wave of area 1
from 1 to I + k. Graph the responses for a sequence
of values of k approaching zero. illustrating that for
smaller and smaller" those curves approach the curve
shown in Fig. 132. Hint: If your CAS gives no solution
and for the other coefficients
Ak
=
I
(m - k)!
m k
d ---k
s~a d.~m-
lim
[(.I' -
ai"'F(s) ]
G(s)
k= 1."'.111-\.
16. TEAM PROJECT. Laplace Transform of Periodic
Functions
(a) Theorem. The Laplace trallSform of a piecewise
COlltilluOllS fUllctioll f(t) n'ith period p is
(I \)
~(f) =
I
1 _ e-Ps
JVe-stf(t) dt
(.I'
0
P
tv
>
0).
Prove this theorem. Hint: Write {o'" = I o + v + ... .
Set t = (n - I)p in the nth integral. Take out e -(n-l)p
from under the integral sign. Use the sum formula for
the geometric series.
248
CHAP. 6
Laplace Transforms
(b) Half-wave rectifier. Using (11), show that the
half-wave rectification of sin wt in Fig. 135 has the
Laplace transform
(c) Full-wave rectifier. Show that the Laplace
transform of the full-wave rectification of sin wt is
W
2
S
+
7TS
w
2
coth - .
2w
(d) Saw-tooth wave. Find the Laplace transform of
the saw-tooth wave in Fig. 137.
w
fit)
k
(A half-wave rectifier clips the negative portions of the
I
i/
curve. Afull-wave rectifier converts them to positive;
see Fig. 136.)
o
p
Fig. 137.
v~_
2rrlm
Fig. 135.
f(t)
3rrlm
Half-wave rectification
Fig. 136.
2rrlm
Saw-tooth wave
~':~
lr",~~Y~
TrIm
3p
(e) Staircase function. Find the Laplace transform of
the staircase function in Fig. 138 by noting that it is
the difference of ktlp and the function in (d).
I
o
2p
O~-----pL------2~p----~3pL------
3rrlm
Full-wave rectification
Fig. 138.
Staircase function
6.5 Convolution. Integral Equations
Convolution has to do with the multiplication of transforms. The situation is as follows.
Addition of transforms provides no problem; we know that :£(f + g) = :£(f) + :£(g).
Now multiplication of transforms occurs frequently in connection with ODEs, integral
equations, and elsewhere. Then we usually know :£(f) and :£(g) and would like to know
the function whose transform is the product :£(f):£(g). We might perhaps guess that it is
fg, but this is false. The transform of a product is generally differentfrom the product of
the transforms of the factors,
:£(fg)
=1=
:£(f):£(g)
in general.
To see this take f = et and g = 1. Then fg = et , :£(fg) = lI(s - 1), but :£(f) = lI(s - 1)
and :£(1) = lis give :£(/):£(g) = lI(s2 - s).
According to the next theorem, the correct answer is that :£(f):£(g) is the transform of
the convolution of f and g, denoted by the standard notation f * g and defined by the
integral
fof(
t
(1)
h(t)
= (f * g)(t) =
'T)g(t - 'T) d'T.
SEC 6.5
249
Convolution. Integral Equations
THEOREM 1
Convolution Theorem
If two functions f and g satisfy the assumption in the existence theorem in Sec. 6.1,
so that their transfonns F and G exist, the product H = FG is the transform of h
given by (1). (Proof after Example 2.)
E X AMP L E 1
Convolution
Let H(s)
1/[(s - a)s]. Find h(t).
=
Solution.
g(t - 7)
1/(s - a) has the inverse fft) = eat. and lis has the inverse get) = 1. With f( 7) = eM and
== I we thus obtain from (I) the answer
h(t) = eat
*
I =
I
t
eaT. I
d7 =
.!.. (eat
I).
-
a
o
To check. calculate
H(s) = :£(h)(s) =
E X AMP L E 2
(_1-
a
a
.!..)
s
s-a -
~
• --- =
s2-as
I
1
s-a
s
-- . -
= :£(eat ) :£(1).
•
Convolution
Let H(s) = 1I(s2
+
w2)2. Find h(t).
The inverse of l/(s2 t- w2 ) is (sin wt)lw. Hence from (I) and the trigonometric formula (11) in
App. 3.1 with x = ~(wt + W7) and y = ~(wt - W7) we obtam
Solution.
sin wt
h(t) = - -
w
sin wt
*-w
=
2
1
W
I
= --2
It
It
sin
W7
sin w(t -
7)d7
0
2w
[-cos wt + cos
W7] d7
0
[
-7COS
sin
W7
wt + - w -
Jt
7~O
2w2 [ -t cos wt + sinwwtJ
•
in agreement with formula 21 in the table in Sec. 6.9.
PROOF
We prove the Convolution Theorem 1. CAUTION! Note which ones are the variables
of integration! We can denote them as we want, for instance, by T and p, and write
and
We now set t
to 00. Thus
=
p
+
T,
G(s) =
where
T
is at first constant. Then p
t)Oe-sCt-'r)g(t T
G(s) =
T) dt = eST
L=o e-SPg(p) dp.
=
t -
t)Oe-stg(t T
T,
and t varies from
T) dt,
T
250
CHAP.6
Laplace Transforms
T in F and t in G vary independently. Hence we can insert the G-integral into the
F-integral. Cancellation of e - ST and eST then gives
F(s)G(s)
=
I
x
e-STf(T)eST
o
I
I
cc
=
e-stg(t - T) dulT
I
X:xl
f(T)
e-stg(t - T) dtdT.
T O T
Here we inregrate for fixed T over T from T to rye and then over T from 0 to co. This is the
blue region in Fig. 139. Under the assumption on f and g the order of integration can be
reversed (see Ref. [A5] for a proof using uniform convergence). We then integrate first
over T from 0 to t and then over t from 0 to x, that is,
F(s)G(s)
=
I
x
e- st
o
f
t
f(T)g(t - T) cIT dt
=
0
I
oc
e-sth(t) dt =
~(h) =
H(s).
0
•
This completes the proof.
r
Fig. 139. Region of integration in the
tT-plane in the proof of Theorem 1
From the definition it follows almost immediately that convolution has the properties
(commutative law)
f*g=g*f
f
* (gl + g2) =
(f
* g] + f * g2
f
* g) * v = .f * (g * v)
(distributive law)
(associative law)
f*O=O*f=O
similar to those of the multiplication of numbers. Unusual are the following two properties.
E X AMP L E 3
Unusual Properties of Convolution
f *I
"* f in general. For instance.
t
(*
(f
* j)(t)
~
I
=
f ..
I d.
o
= ~ (2
"* t.
0 may not hold. For instance. Example 2 with w = I gives
sin t
* sin t
= -~ t
cos t
+ ~ sin t
(Fig. 140).
•
SEC. 6.5
251
Convolution. Integral Equations
4
2
-2
-4
Fig. 140.
Example 3
We shall now take up the case of a complex double root (left aside in the last section in
connection with partial fractions) and find the solution (the inverse transform) directly by
convolution.
E X AMP L E 4
Repeated Complex Factors. Resonance
In an undamped mass-spring system, resonance occurs if the frequency of the driving force equals the natural
frequency of the system. Then [he model is lsee Sec. 2.8)
where "'02 = kIm, k is the spring constant. and 111 is the mass of the body attached to the spring. We assume
yeO) = and y' (0) = 0, for simplicity. Then the subsidiary equation is
°
2
s Y+
2
"'0
K",o
Y=
2
S
+
Its solution is
2 .
"'0
This is a transform as in Example 2 with", =
directly that the solution of our problem is
y(t)
=
K",o (
--2
-tcoS "'ot +
2"'0
"'0
and multiplied by K",o' Hence from Example 2 we can see
sin "'ot )
"'0
=
K
- - 2 (-"'olCOS
"'ol + sin "'ot)·
2"'0
We see that the first term grows without bound. Clearly, in the case of resonance such a tenn must occur. (See
•
also a similar kind of solution in Fig. 54 in Sec. 2.8.)
Application to Nonhomogeneous Linear ODEs
Nonhomogeneous linear ODEs can now be solved by a general method based on
convolution by which the solution is obtained in the form of an integral. To see this, recall
from Sec. 6.2 that the subsidiary equation of the ODE
y"
(2)
+ ay' + by = ret)
(a. b constant)
has the solution [(7) in Sec. 6.2]
Yes) = [(s
+ a)y(O) + y' (O)jQ(s) + R(s)Q(s)
with R(s) = :£(r) and Q(s) = 1/(S2 + as + b) the transfer function. Inversion of the first
term [... ] provides no difficulty; depending on whether 2 - b is positive, zero, or
negative, its inverse will be a linear combination of two exponential functions, or of the
!a
252
CHAP. 6
Laplace Transforms
form (Cl + c2t)e-at/2, or a damped oscillation, respectively. The interesting term is
R(s)Q(s) because ret) can have various forms of practical importance, as we shall see. If
yeo) = 0 and y' (0) = 0, then Y = RQ, and the convolution theorem gives the solution
f qV t
(3)
E X AMP L E 5
y(t)
=
T)r( T) dT.
o
Response of a Damped Vibrating System to a Single Square Wave
Using convolution. determine the response of the damped mass-spring system modeled by
y" + 3/ + 2)' = r(t),
r(t) =
1 if I < t < 2 and 0 otherwise,
yeO}
=/
(O)
= o.
This system with an input (a driving force) that acts for sOllie lillie ollly (Fig. 141) has been solved by partial
fraction reduction in Sec. 6.4 (Example I).
Solution by Convolution.
Q(s} =
s2 + 3s + 2
The transfer function and its inverse are
(s + I}(s + 2)
s + I
hence
s+2'
Hence the convolution integral (3) is (except for the limits of integration)
() I ( )
Yt
=
q t -
T'
I dT
=
I[
e
-(t-T)
- e
-2(t-T)]
d T = e-(t-T) - 2e
1 -2(t-T)
.
Now comes an important point in handling convolution. reT} = 1 if I < T < 2 only. Hence if t < I. the integral
is zero. If I < t < 2. we have to integrate from T = I (not 0) to t. This gives (with the first two terms from the
upper limit)
If t > 2, we have to integrate from
T
=
I to 2 (not to t). This gives
Figure 141 shows the input (the square wave) and the interesting output, which is zero from 0 to I. then increases,
reaches a maximum (near 2.6) after the input has become zero (why?), and finally decreases to zero in a monotone
fashion.
•
yet)
1
/output (response)
0.5
2
Fig. 141.
3
4
Square wave and response in Example 5
Integral Equations
Convolution also helps in solving certain integral equations. that is, equations in which
the unknown function yet) appears in an integral (and perhaps also outside of it). This
concerns equations with an integral of the form of a convolution. Hence these are special
and it suffices to explain the idea in terms of two examples and add a few problems in
the problem ~et.
SEC. 6.5
Convolution. Integral Equations
E X AMP L E 6
253
A Volterra Integral Equation of the Second Kind
Solve the Volterra integral equation of the second kind3
f
yet) -
t
yeT) sin (t - T) dT = t.
o
Solutioll. From (I) we see that the given equation can be written as a convolution. y - y * sin t =
Y = !ley) and applying the convolution theorem, we obtain
t.
Writing
I
s2
yes) - Yes) - 2 - - = yes) - 2 - s+1
s+l
The solution is
and
give~
t3
the answer
yet)
= t
+ (5 .
Check the result by a CAS or by substitution and repeated integration by parts (which will need patience) . •
E X AMP L E 7
Another Volterra Integral Equation of the Second Kind
Solve the Volterra integral equation
y(t) -
t
J
(1
+ T) ylt
-
T) dT = I - sinh r.
o
Solution. By (1) we can write y - (I + t) * Y = 1 - sinh t. Writing Y
convolution theorem and then taking common denominators
yeS)
[I - (.!..
S
(S2 - S -
+
~)J
s2
=
.!..s s
CONVOLUTIONS BY INTEGRATION
Find by integration:
1. 1 * 1
yet) = cosh t.
•
Y(s) •
* et
4.
5. 1
* cos wt
6. 1
8.
s2 _ I
and the solution is
13.
S2(S2
+
14.
1)
s
(S2
+
16)2
(S2
+
1)(S2
*t
2. t
3. t
7. ekt * e- kt
hence
s2 - I
we obtain by using the
l)ls cancels on both sides, so that solving for Y simply gives
Yes) =
11-81
_1_.
= !ley),
eat
* e bt (a
* f(t)
sin t * cos t
i= b)
15.
16.
S(S2 - 9)
5
+ 25)
17. (Partial fractions) Solve Probs. 9, 11, and 13 by using
partial fractions. Comment on the amount of work.
19-161
INVERSE TRANSFORMS
BY CONVOLUTION
9.
11.
1
(s - 3)(s
--=-2--
s(s
+ 4)
+ 5)
SOLVING INITIAL VALUE PROBLEMS
118-251
Find f(t) if 5£(f) equals:
USing the convolution theorem, solve:
10.
12.
18. y"
s(s -
1)
-::-2- - S
(s -
2)
19. y"
20. y"
+
+
+
y' (0)
y
=
4)'
5/
yeO)
sin t.
sin 3t,
=
+
4)'
= 2e-
= O.
yeO) =
2t
,
/(0) = 0
o.
/(0)
yeO)
= 0,
0
= 0
31f the upper limit of integration is variable, the equation is named after the Italian mathematician VITO
VOLTERRA (1860-1940), and if that limit is collsta1l1, the equation is named after the Swedish mathematician
IVAR FREDHOLM (1866-1927). "Of the second kind (fust kind)" indicates that y occurs (does not occur)
outside of the integral.
CHAP.6
254
Laplace Transforms
+
9y = 8 sin t if 0 < t < IT and 0 if t > 7.;
y'(O) = 4
22. y" + 3y' + 2y = 1 if 0 < t < a and 0 if t > a;
yeO) = 0, y' (0) = 0
21. y"
yeO) = 0,
23. y"
+ 4)' =
5u(t - I);
y(O)
24. y" + 5/ + 6y = 8(t y'(O) = 0
25. y"
+
6/
y(o) = L
+
3);
INTEGRAL EQUATIONS
Using Laplace transforms and showing the details, solve:
t
27. y(t) -
= 0, y' (0) = 0
yeo) = 1,
Jo
Y( T) (IT = I
f
f
t
28. y(t)
+
cosh (t - T) dT = t
y( T)
+
e'
o
8y = 28(t /(0)
127-341
I)
+
28(t -
2);
29.
= 0
y(T) -
t
sin (t - T) dT = cos t
y( T)
o
t
26. TEAM PROJECT. Properties of Convolution.
Prove:
(a) Commutativity. f * g = g * f
* g) 7- v = f * (g * v)
Distrihutivity, f * (gl + g2) = f * gl + f * g2
30. y(t)
2
Y(T)
cos
(t -
T)
d. = cos t
t
31. y(t)
(b) Associativity, (f
(c)
(d) Dirac's delta. Derive the sifting formula (4) in
Sec. 6.-1- by using h· with a = 0 [(1), Sec. 6.4] and
applying the mean value theorem for integrals.
Jo
+ J(t o
Jo
+
Jo
+
T)Y(T)
dT
=
1
t
32. y(t) -
y( T)(t -
T)
dT = 2 -
4t 2
t
33. y(t)
2e t
e-TY(T) ciT
=
te'
(e) Unspecified drhing force. Show that forced
vibrations governed by
),'(0) =
K2
with w =1= 0 and an unspecified driving force r(t) can
be written in convolution form.
I
Y = - sin wt
w
6.6
* ret) +
Kl cos wt
K2
+ -
w
sin wt.
35. CAS EXPERIJ\iIENT. Variation of a Parameter.
(a) Replace 2 in Prob. 33 by a parameter k and
investigate graphically how the solution curve changes
if you vary k, in particular near k = - 2.
(b) Make similar experiments with an integral
equation of your choice whose solution is oscillating.
Differentiation and Integration of Transforms.
ODEs with Variable Coefficients
The variety of methods for obtaining transforms and inverse transforms and their
application in solving ODEs is surprisingly large. We have seen that they include direct
integration. the use oflinearity (Sec. 6.1), shifting (Secs. 6.1, 6.3), convolution (Sec. 6.5).
and differentiation and integration of functions f(t) (Sec. 6.2). But this is not all. In this
section we shall consider operations of somewhat lesser importance. namely.
differentiation and integration of transforms F(s) and corresponding operations for
functions f(t), with applications to ODEs with variable coefficients.
Differentiation of Transforms
It can be shown that if a function f(t) satisfies the conditions of the existence theorem in
Sec. 6.1, then the derivative F' (s) = dF/ds of the transform F(s) = ::£(f) can be obtained
by differentiating F(s) under the integral sign with respect to s (proof in Ref. LGR4] listed
in App. 1). Thus, if
F(s) =
{Oo e-stf(t) dt,
then
F'(s)
= - fOe-sttf(t) dt.
o
SEC. 6.6
255
Differentiation and Integration of Transforms. ODEs with Variable Coefficients
Consequently, if SE(f)
=
F(s), then
(1)
=
-F'(s),
SE{tf(t)}
SE-I{F'(s)}
hence
=
-rf(t)
where the second formula is obtained by applying SE- I on both sides of the first formula.
In this way, differentiation of the transform of a function corresponds to the multiplication
of the function by -to
E X AMP L E 1
Differentiation of Transforms. Formulas 21-23 in Sec. 6.9
We shall derive the following three formulas.
f(t)
SE(f)
1
(2)
(S2
+
,I
(S2
+
I
(S2
+
1
--3
(32)2
2f3
s
(3)
(32)2
t
sin f3t
2f3
(32)2
(sin f3t
2f3
S2
(4)
(sin f3t - f3t cos f3t)
1
+ f3t cos f3t)
Solutioll.
From (I) and formula 8 (with w = (3) in Table 6.1 of Sec. 6.1 we obtain by differentiation
(CAUTION! Chain rule')
Dividing by 2f3 and using the linearity of 5£. we obtain (3).
Formulas (2) and (4) are obtained as follows. From (I) and formula 7 (with
w =
(3) in Table 6.1 we find
(S2 +~) - 2s2
(5)
f(t co~ f3t) = -
From this and formula 8 (with
w =
2
(s
r:2 2
+ p )
(3) in Table 6.1 we have
(
I)
5£ t cos f3t ::':: - sin f3t
f3
s2_f32
=
(s
2
2 2 ::':: _,,2
+ f3 ) ,
+ f32
On the right we now take the common denominator. Then we see that for the plus sign the numerator becomes
s2 - ~ + s2 + f32 = 2.. 2 , so that (4) follows by division by 2. Similarly. for the minus sign the numerator
takes the form s2 - f32 - s2 - ~ = -2~. and we obtain (2). This agrees with Example 2 in Sec. 6.5. •
Integration of Transforms
Similarly, if f(t) satisfies the conditions of the existence theorem in Sec. 6.1 and the limit
of f(t)/t, as t approaches 0 from the tight, exists, then for s > k,
(6)
5£{ f;t) } = ["F('S) df
hence
In this way, illfegration of the tmllsform of a function f(t) corresponds
f(t) by t.
10
the division of
256
CHAP. 6
laplace Transforms
We indicate how (6) is obtained. From the definition it follows that
and it can be shown (see Ref. [GR4] in App. 1) that under the above a:.:.umptions we may
reverse the order of integration, that is,
Integration of e-st with respect to
equals e-st/t. Therefore,
s gives e-st/( -t)o Here the integral over s on the right
to
fO
5£{ J(t) }
F(s) d'S =
e- st J(t) dr =
s o t
E X AMP l E 2
(s> k).
t
•
Differentiation and Integration of Transforms
W2)
(I + 7
Find the inverse transform of In
Solution.
=
In
s2 + w2
--S-2-
Denote the given transfonn by F(s). Its derivative is
2
2
2)
2s
d (
In (s + w ) - In s
= -2--2
s
s + w
,
F (s) = -d
-
2s
"2
.
s
Taking the inverse transform and using (I), we obtain
;e-
II F '}
(5) =;e- I{ - 22s
--2
s + w
Hence the inverse fell of H,I') is fO)
Alternatively, if we let
G(s)
2s
=
-
s + w
2} =
-
s
2 cos wt - 2 = -tf(t).
2(1 - cos wt)/t. This agrees with formula 42 in Sec, 6.9.
=
2
-2---2 -
-
s
g(t) = ;e-1(G) = 2(cos wr -
then
•
1).
From this and (6) we get. in agreement with the answer just obtained,
In
s
2
+w
2
J
00
glt)
G(s) ds = - s s t
--2- =
=
2
-
(1 - cos wt).
t
the minus occurring since s is the lower limit of integration.
In a similar way we obtain formula 43 in Sec. 6.9,
;e-1 fin
(I - ::)}
=
~ (I -
cosh at).
•
Special Linear ODEs with Variable Coefficients
Formula (I) can be used to solve certain ODEs with variable coefficients. The idea is this,
Let 5£(y) = Y. Then 5£(/) = sY - yeO) (see Sec. 6.2). Hence by (I),
(7)
,
d
5£(ty ) = - -
ds
[sY - yeO)]
dY
-Y- s - .
ds
SEC. 6.6
257
Differentiation and Integration of Transforms. ODEs with Variable Coefficients
Similarly, :iCy") = s2y - sy(O) - y' (0) and by (1)
(8)
dY
:i(ty"
) =d
- [
- 2
s Y - sy(O) - y ' (0) ] = -2sY - s 2 ~
~
+
y(O).
Hence if an ODE has coefficients such as at + b, the subsidiary equation is a first-order ODE
for Y, which is sometimes simpler than the given second-order ODE. But if the latter has
coefficients at2 + bt + c, then two applications of (1) would give a second-order ODE for
Y, and this shows that the present method works well only for rather special ODEs with variable
coefficients. An important ODE for which the method is advantageous is the following.
E X AMP L E 3
Laguerre's Equation. Laguerre Polynomials
Laguerre's ODE is
+ (I
ty"
(9)
We determine a ,olution of (9) with
[
11
= O.
t)y' + ny
-
=
O.
I. 2..... From (7)-(9) we get the subsidiary equation
dY
. 2sY_s2dY +y(O)] +sY-y(O)- (_Y_S
) +IIY=O.
ds
ds
Simplification gives
2 dY
(s - s ) ds +
+ I - slY = O.
(n
Separating variables, using partial fractions, integrating (with the constant of integration taken zero), and taking
exponentials. we get
(10*)
dY
Y
=_
II + I s -
s ds
(_'_1__
~) ds
I
=
s2
S -
and
s
We write In = Y-\Y) and prove Rodrigues's formula
10 = I,
(10)
II
=
1,2,···.
These are polynomials because the exponential terms cancel if we perform the indicated differentiations. They
are called Laguerre polynomials and are usually denoted by Ln (see Problem Set 5.7, but we continue to reserve
capital letters for transforms). We prove (10). By Table 6.1 and the first shifting theorem (s-shifting),
hence by (3) in Sec. 6.2
because the derivatives up to the order II
(10) and then (10*)]
-
I are zero at O. Now make another shift and divide by n! to get [see
;£(1,,)
11-121
TRANSFORMS BY DIFFERENTIATION
Showing the details of your work. find 5£(f) if f(t) equals:
1. 4te t
3. t sin wt
2. - t cosh 2t
4. t cos (t
+
=
s
=
5. te- 2t ~in t
7. t 2 sinh 4t
2
sin wt
11. t sin (t + k)
9. t
k)
(s _ I)n
~
•
Y.
6.
8.
10.
t 2 sin 3t
tne kt
t cos WI
12. te -kt sin I
258
CHAP.6
Laplace Transforms
I
13-20
INVERSE TRANSFORMS
Using differentiation, integration. s-shifting. or convolution
(and showing the details). find f(t) if 5£(f) equals:
6
s
13.
14.
(s
15.
17.
+
1)2
2(s
[(s
+
(S2
2)
+ 2)2 +
16.
1]2
16)2
s
(S2 _
1)2
and calculate 10 ,
(c) Calculate 10 , '
II=I-tby
s+a
18. I n - s+b
2
(s - k)3
s
s
19. I n - S -
+
(b) Show that
20. arccot -
1
w
21. WRITING
PROJECT.
Differentiation
and
Integration of Functions and Transfonns. Make a
shOlt draft of these four operations from memory. Then
compare your notes with the text and write a report of
2-3 pages on these operations and their significance in
applications.
22. CAS PROJECT. Laguerre Polynomials. (a) Write a
CAS program for finding In(t) in explicit form from
(10). Apply it to calculate 10 , ••• , /10 , Verify that 10 ,
• • . ,110 satisfy Laguerre's differential equation (9).
(11
+
••• ,
•••
110 from this formula.
110 recursively from 10 = 1,
1)ln+l = (2/l
+
I - t)ln - /lIn_I'
(d) Experiment with the graphs of 10 , •••• 110 , finding
out empirically how the first maximum. first minimum.
... is moving with respect to its location as a function
of II. Write a short report on this.
(e) A generating function (definition in Problem Set
5.3) for the Laguerre polynomials is
L
I,,(t)x n = (1 -
X)-IetX/(X-ll.
n=O
Obtain 10 , • • . ,110 trom the corresponding partial sum
of this power series in x and compare the In with those
in (a), (b), or (e) .
f,.7 Systems of ODEs
The Laplace transform method may also be used for solving systems of ODEs. as we shall
explain in terms of typical applications. We consider a first-order linear system with
constant coefficients (as discussed in Sec. 4.1)
(1)
Writing Y 1 = ~(Yl)' Yz = ':£(yz)· G 1
Sec. 6.2 the suhsidiary system
= ':£(gl)' Gz = .:£(gz), we obtain from (I) in
By collecting the Y r and Yz-tenns we have
(2)
By solving this system algebraically for Y1 (s), Yz(s) and taking the inverse transform we
obtain the solution h = ~-l(Yl)' yz = ~-l(yZ) of the given system (I).
SEC. 6.7
259
Systems of ODEs
Note that (1) and (2) may be written in vector form (and similarly for the systems in
the examples); thus, setting y = [h YZ]T, A = [ajk], g = [gl g2]T, Y = [Y1 Yz]T,
G = [G 1 GZ]T we have
y' = Ay
E X AMP L E 1
+g
(A - sI)Y
and
= -yeO) - G.
Mixing Problem Involving Two Tanks
Tank Tl in Fig. 142 contains initially 100 gal of pure water. Tank T2 contains initially 100 gal of water in which
150 Ib of salt are dissolved. The inflow into Tl is 2 gal/min from T2 and 6 gal/min containing 6 Ib of salt trom
the outside. The inflow into T2 is 8 gal/min from T 1. The outflo\>; from T2 is 2 + 6 = 8 gal/min. as shown in
the figure. The mixtures are kept unifonn by stirring. Find and plot the salt contents )"1(1) and Y2(t) in Tl and
T2 , respectively.
Solutioll.
The model is obtained in the form of two equations
Time rate of change = Inflow/min - Outflow/min
for the two tanks (see Sec. 4.1). Thus,
,
)'1
8
= - Wo
,
2
Y1
+ 100)'2 + 6,
8
)'2 =
8
Wo
."1 -
Wo ."2'
The initial conditions are )'1(0) = 0, )'2(0) = 150. From this we see that the subsidiary system (2) is
6
(-0.08 - s)Y1 +
+
s
(-0.08 - S)Y2 = -150.
We solve this algebraically for YI and Y2 by elimination (or by Cramer's rule in Sec. 7.7), and we write the
solutions in terms of partial tractions,
9s + 0.48
YI = - - - - - - - s(s
Y2 =
+
0.12)(s
150s2
s(s
+
+
12s
0.12)(s
100
+ 0.(4)
+
0.48
+ 0.04)
62.5
s
s
+ 0.12
37.5
s + 0.04
100
125
75
s
s + 0.12
s + 0.04
--+
By taking the inverse transfonn \>;e arrive at the solution
)'1 =
100 - 62.5e- O. 12t
-
37.5e-O.04t
)'2 = 100 + 125e- O. 12t - 75e- o.04t .
Figure 142 shows the interesting plot of these functions. Can you give physical explanations for their main
features? Why do they have the limit 100? Why is ."2 not monotone, whereas Yl is? Why is )'1 from some time
on suddenly larger than Y2? Etc.
•
y(tl
2 gal/min
5u
Fig. 142.
l/min
50
Mixing problem in Example I
100
150
200
260
CHAP.6
laplace Transforms
Other systems of ODEs of practical importance can be solved by the Laplace transform
method in a similar way, and eigenvalues and eigenvectors as we had to determine them
in Chap. 4 will come out automatically, as we have seen in Example 1.
E X AMP l E 2
Electrical Network
Find the currents ;1(1) and ;2(1) in the network in Fig. 143 with Land R measured in terms of the usual units
(see Sec. 2.9). U(I) = 100 volts if 0 ~ I ~ 0.5 sec and 0 thereafter, and ;(0) = 0, ;' (0) = o.
L J =0.8H
0.5
1.5
2
2.5
3
Currents
Network
Fig. 143.
Electrical network in Example 2
Solution.
The model of the network is obtained from Kirchhoff's voltage law as in Sec. 2.9. For the lower
circuit we obtain
0.8;~ + l(i} - ;2) + 1.4i} = 100[1 - U(I -
i)l
and for the upper
O.
=
Division by 0.8 and ordering gives for the lower circuit
;~ + 3;1 - 1.25;2 = 125[1 -
i)l
11(1 -
and for the upper
With i}(O) = O. ;2(0) = 0 we obtain from (I) in Sec. 6.2 and the second shifting theorem the subsidiary system
(s
1_
2
+ 3)/
1.25/
= 125
(+ _ -:/2)
e
-I} + (s + 1)12 = O.
Solving algebraically for I} and 12 gives
I} =
12
=
125(s + I)
s(s
+ !)(s + ~)
s(s
+ i)(s + ~)
125
e
(1 -
-s/2
),
(l - e -s/2).
The right sides without the factor I - e -S/2 have the partial fraction expansions
500
h
and
125
-
3(s +
625
i) -
21(s
+ ~)
SEC. 6.7
261
Systems of ODEs
500
250
250
7s
3(s
+ !)
+ ------,,21(s + ~)
respectively. The inverse transform of this gives the solution for 0 ~ t ~
;1(t)
= - --
125
3
e- t12
;2(t)
= - --
250
3
e- tJ2
625
--e
-7t/2
21
+
!,
500
7
(0 ~ t ~
According to the second shifting theorem the solution for t >
I is ;1(t) -
125
3
;1(t -
!) and ;2(t)
- ;2(t
-I), that is,
625
21
;1(t) = - - - (I - e 1l4 )e- tl2 -
- - (I - e7/4)e-7t12
(t
250
i).
500
-7t/2
+ -250
-e
+ -21
7
;2(t) = - -3- (1 - e 1l4 )e-t/2
250
+ ~
>
i)
(I - e7/4 )e-7t/2
Can you explain physically why both currents eventually go to zero, and why i 1(t) has a sharp cusp whereas
;2(t) has a continuous tangent direction at t =
I?
•
Systems of ODEs of higher order can be solved by the Laplace transform method in a
similar fashion. As an important application, typical of many similar mechanical systems,
we consider coupled vibrating masses on springs.
l'
o~
mj
~,
Yj
Q
°l
1,
Y2
Fig. 144.
E X AMP L E 3
=I
m
2
=1
Example 3
Model of Two Masses on Springs (Fig. 144)
The mechanical system in Fig. 144 consists of two bodies of mass I on three springs of the same spring constant
k and of negligibly small masses of the springs. Also damping is assumed to be practically zero. Then the model
of the physical system is the system of ODEs
Y~ = -kYI
+ k(Y2
- Yl)
(3)
Y;
=
-k(Y2 - Yl) - kY2·
Here Yl and Y2 are the displacements of the bodies from their positions of static equilibrium. These ODEs follow
from Newton's second law, Mass X Acceleration = Force, as in Sec. 2.4 for a single body. We again regard
downward forces as positive and upward as negative. On the upper body, -kyl is the force of the upper spring
and k(Y2 - Yl) that of the middle spring, Y2 - Yl being the net change in spring length-think this over before
going on. On the lower body, -k(Y2 - Yl) is the force of the middle spring and -ky2 that of the lower spring.
262
CHAP. 6
Laplace Transforms
We shall determine the solution corresponding to the initial conditions )"1(0) = I, Y2(0) = I. y~(O) = V3k,
y~(O) = -V3k. Let Y 1 = ~(Yl) and Y2 = ~(Y2)' Then from (2) in Sec. 6.2 and the initial conditions we obtain
the subsidiary system
s2Yl - s - V3k = -kYl
s2Y2 - s
+ "\'3i:
+ k(Y2
= -k(Y2 -
Y1 )
- Y1)
kY2.
-
This system of linear algebraic equations in the unknowns Y1 and Y2 may be written
(S2
+ 2k)Yl -kYI
kY2
+ (,,2 +
S
=
+ V3i:
2k)Y2 = s - V3k.
Elimination (or Cramer's rule in Sec. 7.7) yields the solution, which we can expand in terms of partial fractions.
(s
+ V3k)(S2 + 2k) + k(s - V3k)
(s2
(S2
+ 2k)2
s
- k2
+ k + s2 + 3k
S2
+ 2k)(s - V3k) + k(s + V3k)
(s2 + 2k)2 - k 2
s
Hence the solution of our initial value problem is (Fig. 145)
Yl(t) = ~-l(Yl) = cos Ykt
+ sin V3kt
Y2(t) = ~-I(Y2) = cos Ykt - sin V3kt.
We see that the motion of each ma,s is harmonic (the system is undamped !), being the superposition of a "slow"
oscillation and a "rapid" oscillation.
•
0,
"1jJ
2,
-2
Fig. 145. Solutions in Example 3
..-=..
lE5-H 6 7
~
11-201
SYSTEMS OF ODES
Usmg the Laplace transform and showing the details of
YOUT work, solve the initial value problem:
1. )'~
= -)'1 -
Y1(0)
=
0,
)'2,
,
)'2 = )'1 -
)'2,
Y2(0) = I
2. Y~ = 5Yl + Y2' y~ = Y1 + 5Y2'
Y2(0) = -3
Yl(O) = 1,
3. Y~
= -6)'1
+
4Y2,
)'1(0) = -2,
4. y~ + Y2
=
)"1(0) = 1,
0,
Y~ = -4Yl
)'2(0) = -7
4V2,
Yl + y~ = 2 cos T,
h(O) = 0
5. y~ = -4Y1 - 2Y2
Yl(O) = 5.75,
+
t, Y; = 3\'1 +
Y2(0) = -6.75
6. y~ = 4Y2 - 8 cos 4t,
)'1(0) = 0,
+
y;
= -3Yl -
Y2(0) = 3
)'2 -
t,
9 sin 4t,
SEC 6.7
7. y~
263
Systems of ODEs
5)'1 -
=
17t 2 -
)'~ = lOy! - 7Y2 -
Yl (0)
= 2,
8. y~ = 6Yl
)'1(0)
=
Y2(0)
+
FURTHER APPLICATIONS
9t 2 + 2t,
4)'2 -
2t,
22. (Forced vibrations of two masses) Solve the model in
Example 3 with k = 4 and initial conditions Yl(O) = 1,
)'~(O) = 1, Y2(0) = 1, y~(O) = -1 under the assumption
that the force II sin t is acting on the first body and the
force -11 sin t on the second. Graph the two curves on
common axes and explain the motion physically.
= 0
y~ = 9Yl + 6)'2'
Y2(0) = -3
Y2,
-3,
9. y~
= 5Yl + 5)'2 - 15 cost + 27 sint,
y~ = -lOVI - 5Y2 - 150 sin t.
Yl(O) = 2,
Y2(0) = 2
10. y~ = . 2Yl + 3Y2, y~ = 4)'1
Yl(O) = 4,
Y2(0) = 3
11. y~ = Y2
y~ =
)'2(0)
+
-)'1
=
1 - u(t -
+ I -
13. y~ = Yl
+
=
0,
14. y~
=
Y~
= -3)'1
-4Yl
=
16. )":
= 0,
2)e 4t ,
)'2(0)
Y;
-)'2,
Yl(O)
=
YI(O) = 0,
Y; =
y~(O)
=
1)e t ,
t
l)e ,
Y; = 2.\'1 - 5Y2,
0, Y2(0) = 3, y;(O)
17. y~ = 4Yl + 8)'2, Y; = 5Yl + .1'2'
Yl(O) = 8,
)'~(O) = -18,
),;(0) = -21
18. y~
+)'2 =
-101 sin lOt,
Yl(O) = 0,
y~(O) = -6
19. y~ + y~
y~
+ y~
=
=
Y1(0) = 0,
20. 4y~
2.\'2,
= 1
+ Y2 + u(t + 2Y2 + u(t Y2(0) = 3
= 1,
+
)'1
I),
-Yl + 2[1 - u(t - 217)] cos t,
Y2(0) = 0
= -2YI + 2)'2,
.\'1(0)
I),
1),
y~ = 4Yl + 2Y2 + 64tu(t Y2(0) = 0
6u(t -
Yl(O) = 1,
15. y~
Y2,
-
0
12. y~ = 2Yl + Y2,
Yl(O) = 2,
.\'1(0)
u(t -
23. CAS Experiment. Effect of Initial Conditions. In
Prob. 22, vary the initial conditions systematically,
describe and explain the graphs physically. The great
variety of curves will surprise you. Are they always
periodic? Can you find empirical laws for the
changes in terms of continuous changes of those
conditions?
Y;
Y~(O) = 6,
)'2(0) = 1,
0
Y2(0) = 5,
2 sinh t,
=
2H
=
4H
Network
."3(0) = 1
-2y~ + y~
+
25. (Electrical network) Using Laplace transforms, find
the currents i 1 (t) and i2 (t) in Fig. 146, where
u(t) = 390 cos l and i 1(0) = 0, i 2(0) = O. How
soon will the currents practically reach their steady
state?
Yl = 101 sin lOt,
Y2(0) = 8,
y~ + Y~
2e t + e- t ,
et
y~ - 2y~ = O.
2y~ - 4y~ = -I6t
y](O) = 2,
Y2(O)
+
=
24. (Mixing problem) What will happen in Example I if
you double all flows (in particular, an increase to
12 gal/min containing 12 Ib of salt from the outside),
leaving the size of the tanks and the initial conditions
as before? First guess, then calculate. Can you relate
the new solution to the old one?
i(t)
l.
40
= 0,
Y3(0)
= 0
20
21. TEAM PROJECT. Comparison of Methods for
Linear Systems of ODEs.
-20
(a) Models. Solve the models in Examples 1 and 2 of
Sec. 4.1 by Laplace transfonns and compare the
amount of work with that in Sec. 4.1. (Show the details
of your work.)
-40
(b) Homogeneous Systems. Solve the systems (8),
(11)-(13) in Sec. 4.3 by Laplace transfonns. (Show the
details.)
(c) Nonhomogeneous System. Solve the syslem (3)
in Sec. 4.6 by Laplace transfonns. (Show the details.)
Currents
Electrical network and
currents in Problem 25
Fig. 146.
26. (Single cosine wave) Solve Prob. 25 when the EMF
(electromotive force) is acting from 0 to 217 only. Can
you do this just by looking at Prob. 25, praclically
without calculation?
264
6.8
CHAP. 6
Laplace Transforms
Laplace Transform: General Formulas
Fonnula
F(s) =
~(f(t)} =
Name, Comments
x
L
e-stf(t) dl
Definition of Transfonn
0
6.1
f(t) = ~-lIF(s)}
~(af(t)
+ hg(t)}
= a~{f(t)}
Inverse Transform
+ h~(g(t)}
= F(s - a)
~{eO'tf(t)}
:;P-l{F(s - a)}
Sec.
= eatf(t)
Linearity
6.1
s-Shifting
(First Shifting Theorem)
6.1
;£(f') = s~(f) - f(O)
~(f")
=
s2~(f) - sf(O) -
'f(t n )
=
sn~(f)
t' (0)
- s<n-l)f(O) - ...
Differentiation
of Function
6.2
... - tn-0(O)
~ {{f(T) dT} = ~ ~(f)
* g)tt) =
(f
I
I
Integration of Function
t
f( T)g(t - T) dT
0
=
t
f(t - T)R(T) dT
Convolution
6.5
t-Shifting
(Second Shifting Theorem)
6.3
0
~(f
~(f(t
* g)
= ~(f)~(g)
- a) u(t - a)} = e-asF(s)
~-l(e-asF(s)} =
f(t - a) u(t -
(I)
'f(tf(t)} = -F'(s)
~ { f~t) }
~(f) =
=
1 _ 1e-Ps
(0
I
6.6
F{S) di'
r
e-stf(t) dt
0
Differentiation of Transform
Integration of Transform
f Periodic with Period p
6.4
Project
16
265
SEC. 6.9
Table of Laplace Transforms
6.9
Table of Laplace Transforms
For more extensive tables, see Ref. [A9] in Appendix 1.
F(s) = ~(f(t)}
1
lis
2
lIs2
3
lIs n
4
1/V:;:
5
lIs3/2
(n = 1,2, ... )
8
9
10
11
12
Sec.
t
tn-l/(n - 1)!
6.1
lIYm
2vthr
6
7
f(t)
ta-l/f(a)
(a> 0)
-
s - a
(s - a)2
6.1
1
(s - a)n
1
(n=I,2,···)
(k> 0)
(s - a)(s - b)
(a
* b)
(a
* b)
s
(s - a)(s - b)
1
- - - (ae at -
(a - b)
bebt )
--+------------~-------------+----
13
14
1
s
sin wt
cos wt
1
- sinh at
a
s
17
18
1
w
15
16
-
6.1
cosh al
-
1
eat
w
s-a
eat
sin
wI
cos wt
1
19
2
w
20
3
21
--3
I
w
(l - cos wt)
(wt - sin wt)
1
2w
(sin wt - wt cos wt)
6.6
( continued)
266
CHAP. 6
Laplace Transforms
Table of Laplace Transforms (continued)
F(s) = !f{f(t)}
22
W 2)2
t
sinwt
2w
1
(sin wt
2w
s
+
(S2
S2
23
24
25
26
27
28
29
30
(S2
+
W 2)2
(S2
+
£l2)(S2
s
+
b 2)
(a 2 =t- b 2)
1
2
b - a
1
1
S4
+ 4k4
S4
+
Sec.
J(t)
-3
4k
2
} 66
+
wt cos wt)
(cos at - cos bt)
(sin kt cos kt - cos kt sinh kt)
1
2k2 sin kt sinh kt
s
4k4
1
1
-3
s4 - k4
2k
s
1
-2
S4 - k4
2k
~-Vs=b
1
~~
(sinh kt - sin kt)
(cosh kt - cos kt)
1 _ (e bt _ eat)
__
2v:;;(i
- b l)
e-(a+b)t/21o (a
-2
5.6
Jo(at)
5.5
1
31
+
VS2
a2
I
32
s
(s - a)3/2
33
1
(S2 _ a 2)k
v:;;t
eat(l
V;
r(k)
( -t-
7ft
(k> 0)
2a
+
r-
2at)
1 2
/
1
at
k-1/2( )
5.6
e-as/s
u(t - a)
e- as
ti(t - a)
6.3
6.4
36
-1 e -k/s
fo(2Vkt)
5.5
37
-1e -kls
38
_
39
e- kVs
40
1
-Ins
S
34
35
s
1
v:;;t
7ft
Vs
1
~/2
cos 2Vkt
1
e kls
v:;;:k sinh 2Vkt
7fk
(k> 0)
- -k - e _k2/4t
2v:;;(i
-int-I'
(1'
= 0.5772)
5.6
(continued)
267
Chapter 6 Review Questions and Problems
Table of Laplace Transforms (continued)
F(s) = ~ (f(t)}
f(t)
1
.\'-a
41
In--
42
In
43
In
44
w
arctan -
_ (e bt _ eat)
s-h
.1'2
+
t
w2
2
-(1 - cos wt)
.\'2
2
-(1 - cosh at)
.\'2
t
I
- sin wt
t
.I'
45
-
I
6.6
t
a2
.1'2 -
Sec
arccot s
App.
A3.1
Si(t)
.I'
======-
= : .:.'= -= === :_:=. -£¥IE:--w.::::Q--u::E-"S T ION 5
1. What do we mean by operational calculus?
2.
What are the steps needed in solving an ODE by Laplace
transform? What is the subsidiary equation?
3. The Laplace transform is a linear operation. What does
this mean? Why is it important?
AND PRO B L EMS
13. sin 2 t
15. tu(t - '17)
17. e t * cos 2t
19. sin t + sinh t
21. eat _ e bt (a
14. cos 2 4t
2'17) sin t
16. l/(t -
"*
h)
18. (sin wt) * (cos wt)
20. cosh 1 - cos t
22. cosh 2t - cosh t
4. For what problems is the Laplace transform preferable
over the usual method? Explain.
INVERSE LAPLACE TRANSFORMS
5. What are the unit step and Dirac's delta functions"? Give
examples.
Find the inverse transform (showing the details of your work
and indicating the method or formula used):
6. What is the difference between the two shifting
theorems? When do they apply?
23.
10.1'
+2
.1'2
+ 4.\' + 20
7. Is .'£[f(t)g(t») = .'£ [f(t)}.'£{g(t))? Explain.
8. Can a discontinuous function have a Laplace transform?
Does every continuous function have a Laplace
transform? Give reasons.
9. State the transforms of a few simple functions from
memory.
25.
29.
111-221
31.
Find the transform (showing the details of your work and
indicating the method or formula you are Using):
11. te 3t
12. e- t sin 2t
12
5.1' + 4
- e- 2s
27. - .1'2
10. If two different cuntinuous functions have transforms,
the latter are different. Why is this practically important?
LAPLACE TRANSFORMS
24.
.1'2
33.
(S2
2s + 4
+ 4.1' + 5)2
(.~
+2
.1'3
) e-s
26.
15
3.1'
.1'2-2.1'+2
2.1' - 10
28.
.1'3
32.
(.1'2
180
+
w2 )
e- 5s
16
.1'2 -
30.
+
16)2
+
18.1'2
+
.1'7
2
IT
.1'2(.1'2
4
.1'2 -
34.
2s2
+ 2.1' +
I
3.1'4
268
CHAP.6
135-501
Laplace Transforms
o< t <
SINGLE ODEs AND SYSTEMS OF ODEs
Solve by Laplace transforms. showing the details and
graphing the solution:
35. y" + Y = uCt
y' (0) = 20
-
yeO) =
1).
36. y" + L6y = 48(r - 'iT),
y'(O) = 0
37. y" + 4)" = 88(r - 5),
y' (0)
38. y"
+
'iT,
v(t) = 0 if t
'L
yeO) = -1,
/(0) = 0
39. y" + 2y'
+
vet)
yeO) = 10,
2).
\'(0) = 0,
lOy = 0,
+ 4/ +
y'(o) = -5
LC-circuit
53. (RLC-circuit) Find and graph the current i(t) in the
RLC-circuit in Fig. 149, where R = 1 n. L = 0.25 H,
C = 0.2 F, v(t) = 377 sin 20t V, and CUlTent and charge
yeo) = 7,
/ (0) = -1
40. y"
L
Fig. 148.
Y = 1I(t -
and CUlTent and charge at
'iT,
o.
-1
=
>
t = 0 are O.
at t = 0 are O.
yeo) = 5.
5y = SOt.
41. y" - y' - 2y = 12u(t - 'iT) sin t.
y(O) = 1,
y' (0) = -1
42. y" - 2/
yeO) =
+
y = t8(t - 1),
o.
),'(0)
= 0
43. y" - 4/ + 4y = 8(t - 1) - 8(r - 2).
yeO) = o.
y' (0) = 0
44. y" + 4y = 8(t - iT) - 8(t - 2 'iT),
yeO) = I,
/(0) = 0
45. Y~ + Y2 = sin t, y~ + Yl = -sint,
vet)
Fig. 149.
54. (Network) Show [har by Kirchhoff's voltage law
(Sec. 2.9), the CUlTents in the network in Fig. L50 are
obtained from the system
Y2(0) = 0
h(O) = 1,
46. y~ = -3Yl + Y2 - 12t, y~ = -4YI
Yl (0) = 0,
Y2(0) = 0
+
2Y2
+
12t,
)'~
= J2,
h(O)
=
=
0,
-4Yl
+ 8Ct -
)"2'
= 2.
= L6Y2,
}'~ = 16Y1,
h(O) = 2. y~(O) = 12,
R(i l
.f
11)
i 2 ) = v(t)
-
1.
+ C
12 =
O.
Solve this system. where R = 1 n, L = 2 H. C = 0.5
F. v(t) = 90e- t /4 V. i 1 (Q) = O. i 2 (Q) = 2 A.
'iT).
y~(O) = 3
50. y~
y~(O) = 4
MODELS OF CIRCUITS AND NETWORKS
51. (RC-circuit) Find and graph the CUlTent i(t) in the RCcircuit in Fig. 147, where R = 100 n, C = 10- 3 F,
v(t) = 100rV if 0 < t < 2, v(t) = 200 V if t > 2 and
the initial charge on the capacitor is O.
~c
R
+
R(12 -
)'2(0) = 0
49. y~ = 4Y2 - 4e t . y~ = 3Yl +
)'1 (0) = 1.
Y ~ (0) = 2. Y2(O)
Li~
.,
47. y~ = Y2, y~ = -5Yl - 2Y2,
h (0) = 0,
Y2(0) = 1
48. y~
RLC-circuit
L
w)c:~r oJ
Fig. 150.
Network in Problem 54
55. (Network) Set up the model of the network in Fig. 151
and find and graph the CUlTents. assuming that [he
currents and the charge on the capacitor are 0 when the
switch is closed at t = O.
L=lH
vet)
Fig. 147.
c= 0.01
RC-circuit
52. (LC-circuit) Find and graph the charge
q(t) and the
current i(t) in the LC-circuit in Fig. 148, where
L = 0.5 H, C = 0.02 F, v(t) = 1425 sin 51 V if
Switch
Fig. 151.
R2 = 30 n
Network in Problem 55
F
269
Summary of Chapter 6
Laplace Transforms
The main purpose of Laplace transforms is the solution of differential equations and
systems of such equations, as well as corresponding initial value problems. The
Laplace transform Hs) = :£(f) of a function f(t) is defined by
F(s)
(1)
=
:£(f)
=
L'X)e-stf(t) dt
(Sec. 6.1).
o
This definition is motivated by the property that the differentiation of f with respect
to t corresponds to the multiplication of the transform F by s; more precisely,
5£(f') = s:£(f) - f(O)
(2)
:£(f") = s2:£(f) - sf(O) -
(Sec. 6.2)
f' (0)
etc. Hence by taking the transform of a given differential equation
y"
+ ay' +
by = ret)
(a, b constant)
and writing :£(y) = Yes), we obtain the subsidiary equation
(4)
(S2
+
as
+
b)Y
=
:£(r)
+
sf(O)
+ t' (0) +
afCO).
Here, in obtaining the transform :£(r) we can get help from the small table in
Sec. 6.1 or the larger table in Sec. 6.9. This is the first step. In the second step we
solve the subsidiary equation algebraically for Y(s). In the third step we determine
the inverse transform yet) = :;e-l(y). that is, the solution of the problem. This is
generally the hardest step. and in it we may again use one of those two tables. Yes)
will often be a rational function, so that we can obtain the inverse :£-1(Y) by partial
fraction reduction (Sec. 6.4) if we see no simpler way.
The Laplace method avoids the determination of a general solution of the
homogeneous ODE. and we also need not determine values of arbitrary constants
in a general solution from initial conditions; instead, we can insert the latter directly
into (4). Two further facts account for the practical importance of the Laplace
transform. First, it has some basic properties and resulting techniques that simplify
the determination of transforms and inverses. The most important of these properties
are listed in Sec. 6.8, together with references to the corresponding sections. More
on the use of unit step functions and Dirac's delta can be found in Secs. 6.3 and
6.4, and more on convolution in Sec. 6.5. Second, due to these properties, the present
method is particularly suitable for handling right sides r(t) given by different
expressions over different intervals of time, for instance, when ret) is a square wave
or an impulse or of a form such as ret) = cos t if 0 ~ t ~ 47T and 0 elsewhere.
The application of the Laplace transform to systems of ODEs is shown in
Sec. 6.7. (The application to PDEs follows in Sec. 12.11.)
PA RT
_.1
B
Linear Algebra.
Vector Calculus
.~
'.'
•
CHAPTER
7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
CHAPTER
8
Linear Algebra: Matrix Eigenvalue Problems
CHAPTER
9
Vector Differential Calculus. Grad, Div, Curl
CHAPTER 10
Vector Integral Calculus. Integral Theorems
Linear algebra in Chaps. 7 and 8 consists of the theory and application of vectors and
matrices, mainly related to linear systems of equations, eigenvalue problems, and linear
transformations.
Linear algebra is of growing importance in engineering research and teaching because it
forms a foundation of numeric methods (see Chaps. 20-22), and its main instruments,
matrices, can hold enormous amounts of data-think of a net of millions of telephone
connections-in a form readily accessible by the computer.
Linear analysis in Chaps. 9 and 10. usually called vector calculus, extends differentiation
of functions of one variable to functions of several variables-this includes the vector
differential operations grad, div, and curl. And it generalizes integration to integrals over
curves, surfaces, and solids, with transformations of these integrals into one another, by
the basic theorems of Gauss, Green, and Stokes (Chap. 10).
Software suitable for linear algebra (Lapack, Maple, Mathematica, Matlab) can be found
in the list at the opening of Part E of the book if needed.
Numeric linear algebra (Chap. 20) can be studied directly after Chap. 7 or 8 because
Chap. 20 is independent of the other chapters in Part E on numerics.
271
CHAPTER
..
., ,-,
•
'\
7
-~
Linear Algebra: Matrices,
Vectors, Determinants.
Linear Systems
This is the first of two chapters on linear algebra, which concerns mainly systems of
linear equations and linear transformations (to be discussed in this chapter) and eigenvalue
problems (to follow in Chap. 8).
Systems of linear equations, briefly called linear systems, arise in electrical networks,
mechanical frameworks. economic models_ optimization problems, numerics for
differential equations, as we shall see in Chaps. 21-23, and so on.
As main tools. linear algebra uses matrices (rectangular arrays of numbers or functions)
and vectors. Calculations with matrices handle matrices as single objects, denote them by
single letters, and calculate with them in a very compact form, almost as with numbers,
so that matrix calculations constitute a powerful "mathematical shorthand".
Calculations with matrices and vectors are defined and explained in Secs. 7.1-7.2.
Sections 7.3-7.8 center around linear systems, with a thorough discussion of Gauss
elimination, the role of rank. the existence and uniqueness problem for solutions (Sec. 7.5),
and matrix inversion. This also includes determinants (Cramer's rule) in Sec. 7.6 (for
quick reference) and Sec. 7.7. Applications are considered throughout this chapter. The
last section (Sec. 7.9) on vector spaces, inner product spaces, and linear transformations
is more abstract. Eigenvalue problems follow in Chap. 8.
COMMENT. Numeric linear algebra (Sees. 20.1-20.5) call be studied immediately
after this chapter.
Prerequisite: None.
Sections thatma)" be omitted in a short course: 7.5, 7.9.
References lind Answers to Problems: App. I Part B, and App. 2.
7.1
Matrices, Vectors:
Addition and Scalar Multiplication
In this ~ection and the next one we introduce the basic concepts and rules of matrix and
vector algebra. The main application to linear systems (systems of linear equations) begins
in Sec. 7.3.
272
SEC. 7.1
Matrices, Vectors: Addition and Scalar Multiplication
273
A matrix is a rectangular array of numbers (or functions) enclosed in brackets. These
numbers (or fUnctions) are called the entries (or sometimes the elements) of the matrix.
For example,
-0.2 ~:J '
[0~3
(1)
x
[ee 6x
2
2X
4x
J,
a,,]
ran
a12
a2l
a 22
°23
a3l
a32
a33
[al
a2
a3]'
[:J
are matrices. The first matrix has two rows (horizontal lines of entries) and three columns
(vertical lines). The second and third matrices are square matrices, that is, each has as
many rows as columns (3 and 2, respectively). The entries of the second matrix have two
indices giving the location of the entry. The first index is the number of the row and the
second is the number of the column in which the entry stands. Thus, a23 (read a 111'0 three)
is in Row 2 and Column 3, etc. This notation is standard, regardless of whether a matrix
is square or not.
Matrices having just a single row or column are called vectors. Thus the fourth matrix
in (l) has just one row and is called a row vector. The last matrix in (1) has just one
column and is called a column vector.
We shall see that matrices are practical in various applications for storing and processing
data. As a first illustration let us consider two simple but typical examples.
E X AMP L E 1
Linear Systems, a Major Application of Matrices
In a system of linear equations, briefly called a linear system, such as
the coefficients of the unknowns
Xl, X2, X3
are the entries of the coefficient matrix, call it A,
The matrix
6
9
o
-2
-8
is obtained by augmenting A by the right sides of the linear system and is called the augmented matrix of the
system. In A the coefficients of the system are displayed in the pattern of the equations. That is, their position
in A corresponds to that in the system when written as shown. The same is true for A.
We shall see that the augmented matrix A contains all the informatIon about the solutions of a system,
so that we can solve a system just by calculations on its augmented matrix. We shall discuss this in great
detail, beginning in Sec. 7.3. Meanwhile you may verify by substitution that the solution is xl = 3, x2 =
!,
X3
= -1.
The notation
letters.
Xl, X2, X3
for the unknowns is practical but not essential; we could choose x, y,
Z
or some other
•
274
E X AMP L E 2
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Sales Figures in Matrix Form
Sales figures for three products I. II. !II in a store on Monday (M). Tuesday (T). ... may for each week be
arranged in a matrix
M
A=
[7
100
T
Vv
Th
F
330
810
0
210
470]
120
780
SOO
SOO
960
II
0
0
270
430
780
III
S
If the company has ten stoTes. we can set up ten such matrices, one for each store. Then by adding corresponding
entries of these mmrices we can get a mmrix ,howing the IOtal sale~ of each product on each day. Can you think
of other data for which matrices are feasible? FOT instance. in transportation or storage problems? Or in recoTding
•
phone calls. or in li,ting distances in a network of roads?
General Concepts and Notations
We shall denote matrices by capital boldface letters A, B, C. ... ,or by writing the general
entry in brackets; thus A = [ajk], and so on. By an m x 11 matrix (read 171 by n matrix)
we mean a matrix with m rows and n columns-rows come always first! m X 11 is called
the size of the matrix. Thus an 17l X 11 matrix is of the form
(2)
The matrices in (I) are of sizes 2 X 3.3 X 3,2 X 2, I X 3. and 2 X l. respectively.
Each entry in (2) has two subscripts. The first is the row number and the second is the
column number. Thus (/21 is the entry in Row 2 and Column I.
If m = n, we call A an n X n square matrix. Then its diagonal containing the emries
a11, a22, . . . , ann is called the main diagonal of A. Thus the main diagonals of the two
square matrices in (1) are an, (/22' a33 and e- x , 4x, respectively.
Square matrices are particularly important. as we shall see. A matrix that is not square
is called a rectangular matrix.
Vectors
A vector is a matrix with only one row or column. Its entries are called the components
of the vector. We shall denote veCIOrs by lowercase boldface letters a, b, ... or by its
general component in brackets, a = [OJ], and so on. Our special vectors in (I) suggest
that a (general) row vector is of the form
For instance,
a = [-2 5 0.8 0
1].
SEC. 7.1
275
Matrices, Vectors: Addition and Scalar Multiplication
A column vector is of the form
b=
For instance,
Matrix Addition and Scalar Multiplication
What makes matrices and vectors really useful and particularly suitable for computers is
the fact that we can calculate with them almost as easily as with numbers. Indeed, we
now introduce rules for addition and for scalar multiplication (multiplication by numbers)
that were suggested by practical applications. (Multiplication of matrices by matrices
follows in the next section.) We first need the concept of equality.
D E FIN I T ION
Equality of Matrices
Two matrices A = [ajk] and B = [bjk ] are equal, written A = B, if and only if they
have the same size and the corresponding entries are equal, that is.
a11 = b 11 , a12 = b 12 , and so on. Matrices that are not equal are called different.
Thus, matrices of different sizes are always different.
E X AMP L E 3
Equality of Matrices
Let
and
B=
[43 -1OJ
= 4.
a12
Then
A=B
all
=
if and only if
o.
The following matrices aTe all different. Explain!
[~ ~J
DEFINITION
[~
3
2
~J
[~
4
Addition of Matrices
The sum of two matrices A = [ajk] and B = [bjkJ of the same size is written
A + B and has the entries ajk + bjk obtained by adding the corresponding entries
of A and B. Matrices of different sizes cannot be added.
As a special case, the sum a + b of two row vectors or two column vectors, which must
have the same number of components, is obtained by adding the corresponding
components.
276
E X AMP L E 4
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Addition of Matrices and Vectors
-4
If
.\ =
[
6
o
~J
and
B = [:
-I
°oJ·
then
A
+B
=
[~
5
2
A in Example 3 and our present A cannot be added. If a = [5 7 21 and b = [-6
a+b=[-I 9 21.
An application of matrix addition was suggested in Example 2. Many others will follow.
2
OJ. then
•
Scalar Multiplication (Multiplication by a Number)
DEFINITION
The product of any /1l X 1l matrix A = [ajk] and any scalar c (number c) is written
cA and is the I1l X 11 matrix cA = [cajk] obtained by mUltiplying each entry of A
by c.
Here (-I)A is simply written -A and is called the negative of A. Similarly, (- k)A is
written - kA. Also, A + (- B) is written A - B and is called the difference of A and B
(which must have the same size!).
E X AMP L E 5
Scalar Multiplication
If
A
=
2.7 -1.8]
[ 0 0.9
9.0
then
-A =
-2.7 1.8]
[ 0 -0.9
-9.0
-4.5
~OA=
4.5
[
3-2]
0
10
[0 0]
1,OA=0
-5
0
O.
0
If a matrix B shows the distances between some cities in miles. 1.60,)B gives these distances in kilometers. •
Rules for Matrix Addition and Scalar Multiplication. From the familiar laws for the
addition of numbers we obtain similar laws for the addition of matrices of the same size
111 X 11, namely,
(a)
A+B=B+A
(b)
(A + B) + C = A + (B + C)
(3)
A
(c)
(d)
(written A + B + C)
+0=A
A+(-A)=O.
Here 0 denotes the zero matrix (of size 111 X 11), that is. the III X 11 matrix with all entries
zero. (The last matrix in Example 5 is a zero matrix.)
Hence matrix addition is commutative and associative [by (3a) and (3b)].
Similarly, for scalar multiplication we obtain the rules
(4)
(a)
c(A + B) = cA + cB
(b)
(c + k)A = cA + kA
(c)
c(kA) = (ck)A
(d)
IA = A.
(written ckA)
SEC 7.1
. 1fD=B""l £:M::::S E T 7
11-81
~]
ADDITION AND SCALAR MULTIPLICATION
OF MATRICES AND VECTORS
Let
A =
[-~ ~ :l,
6
C
5
B =
-4J
~l.
= [:
1 3J
u~
[].
[-~
-:
-3
D
=
F
4
-:l,
oj
[-~ ~ l
-8 3J
[-~1
Find the following expressions or give reasons why they
are undefined.
1. C
277
Matrices, Vectors: Addition and Scalar Multiplication
+
D, D + C, 6(D - C), 6C - 6D
2. 4C, 2D, 4C + 2D, 8C - OD
3. A + C - D, C - D, D - C, B + 2C + 4D
4. 2(A + B), 2A
+
2B, 5A - ~B, A
+
B + C
5. 3C
8D, 4(3A). (4' 3)A, B - fDA
6.5A
3C, A - B + D, 4(B - 6A), 4B - 24A
7. 33u, 4v
+
15. (General rules) Prove (3) and (4) for general 3 X 2
matrices and scalars c and k.
16. TEAM PROJECT. Matrices in Modeling Networks.
Matrices have various applications, as we shall see,
m a form that the~e problems can be efficiently
handled on the computer. For instance, they can be
used to characterize connections in electrical
networks, in nets of roads, in production processes,
etc., as follows.
(a) Nodal incidence matrix. The network in Fig. 152
consists of 5 branches or edges (connections, numbered
1, 2, .. ·,5) and 4 nodes (points where two or more
branches come together), with one node being
grounded. We number the nodes and branches and give
each branch a direction shown by an arrow. This we
do arbitrarily. The network can now be described by a
"nodal incidence matrix" A = [ajk], where
(J)
- 1 if branch k enters node (j)
{
o if branch k does not touch (J).
+ 1 if branch k leaves node
Gjk =
Show that for the network in Fig. 152 the matrix A has
the given form
9u, 4(v + 2.25u), u - v
8. A + u, 12u + lOy, O(B - v), OB
+
u
9. (Linear system) Write down a linear system (as in
Example I) whose augmented matrix is the matrix B
in this problem set.
10. (Scalar multiplication) The matrix A in Example 2
shows the numbers of items sold. Find the matrix
showing the number of units sold if a unit consists of
(a) 5 items, (b) 10 items?
11. (Double subscript notation) Write the entries of A in
Example 2 in the general notation shown in (2).
12. (Sizes, diagonal) What sizes do A, B, C, D, u, v in
this problem set have? What are the main diagonals of
A and B, and what about C?
13. (Equality) Give reasons why the five matrices in
Example 3 are different.
14. (Addition of vectors) Can you add (a) row vectors
whose numbers of components are different, (b) a row
and a column vectOr with the same number of
components, (c) a vector and a scalar?
Branch
Node
CD
®
Node ®
Node
Node@
1
[-~
2
3
4
-1
-1
0
1
0
1
0
1
0
0
0
-1
5
-~l
Network and nodal incidence
matrix in Team Project 16(a)
Fig. 152.
(b) Find the nodal incidence matrices of the networks
in Fig. 153.
CHAP. 7
278
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
+ 1 if branch k is in mesh
[2J
and has the same orientation
11Ijk
~
-
1 if branch k is in mesh
[2J
and has the opposite orientation
o if branch k is not in mesh [2J
and a mesh is a loop with no branch in its interior (or
in its exterior). Here, the meshes are numbered and
directed (oriented) in an arbitrary fashion. Show that
in Fig. 154 the matrix M corresponds to the given
figure, where Row I corresponds to mesh I, etc.
Fig. 153.
Networks in Team Project 16{b)
(c) Graph the three networks corresponding to the
nodal incidence matrices
-I]
-I
o
-I
I
o
o
-1
o
-1
o
o
o
o
o
o
-1
-I
,
o
-I
o
o
o
0
-1
o
-I
o
Fig. 154.
-1
(d) Mesh incidence matrix. A network can also be
characterized by the mesh incidence matrix M = [mjkl.
where
7.2
M~ [~
o 0-1
o o
1
0
-1
0
0
0
1
-1
-1
1
0
1
0
1
0
0
~l
Network and matrix M in
Team Project 16{d)
Number the nodes in Fig. 154 from left to right I,
2, 3 and the low node by 4. Find the corresponding
nodal incidence matrix.
(e)
Matrix Multiplication
Matrix mUltiplication means multiplication of matrices by matrices. This is the last
algebraic operation to be defined (except for transposition, which is of lesser importance).
Now matrices are added by adding corresponding entries. In multiplication, do we multiply
corresponding entries? The answer is no. Why not? Such an operation would not be of
much use in applications. The standard definition of multiplication looks artificial, but
will be fully motivated later in this section by the use of matrices in "linear
transformations," by which this multiplication is suggested.
SEC. 7.2
Matrix Multiplication
279
Multiplication of a Matrix by a Matrix
DEFINITION
The product C = AB (in this order) of an 111 X 11 matrix A = [Gjk] times an r X p
matrix B = [bjk ] is defined if and only if r = 11 and is then the 111 X P matrix
C = [Cjk] with entries
n
(1)
Cjk
=
L
Gjtb/k
=
+
Gjl b lk
Gj2 b 2k
+ ... + Gjnbnk
t~l
j
= 1.···. m
k
=
L··· .p.
The condition r = n means that the second factor, B, must have as many rows as the first
factor has columns, namely n. As a diagram of sizes (denoted as shown):
A
C
B
[m X
11]
[n X r]
=
[111 X r].
in (1) is obtained by multiplying each entry in thejth row of A by the corresponding
entry in the kth column of B and then adding these 11 products. For instance,
C21 = G21bl1 + G22b21 + ... + G2nbnl, and so on. One calls this briefly a
"multiplicatioll of rows into columlls." See the illustration in Fig. 155, where 11 = 3.
Cjk
Fig. 155.
E X AMP L E 1
Notations in a product AB = C
Matrix Multiplication
3
AB = [ :
:
-6
-3
-:] [:
2
'0
=
1
-2
43
-16
14
4
-37
-9
on. The entry in the box is
("23 =
42]
6
- 28
4' 3 + O· 7 + 2 . 1 = 14.
•
Multiplication of a Matrix and a Vector
2J [3J = [4'3 + 2'5J = [22J
8
5
I . 3 + 8· 5
43
E X AMP L E 3
7
9-4
Here ell = 3 . 2 + 5 . 5 + (- I) . 9 = 22, and
The product BA is not defined.
E X AMP L E 2
-:
1] [2226
8
whereas
is undefined.
•
Products of Rowand Column Vectors
6
[3
6
12
24
•
280
E X AMP L E 4
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
CAUTION! Matrix Multiplication Is Not Commutative, AB
'* BA in General
This is illustrated by Examples I and 2. where one of the two products is not even defined. and by Example 3.
where the two products have different sizes. But it also holds for square matrices. For instance.
I] [99 -9999J .
but
100
=
-<)<)
It is interesting that this also shows that AB = 0 does 1101 necessarily imply BA = 0 or A = 0 or B = O. We
shall discuss this further in Sec. 7.8. along with reasons when this happens.
•
Our examples show that the order offactors in matrix products must always be obse",ed
vel)' carefully. Otherwise matrix multiplication satisfies rules similar to those for numbers,
namely.
(kA)B = k(AB) = A(kB)
(a)
written kAB or AkB
written ABC
A(BC) = (AB)C
(b)
(2)
+ mc = AC + BC
(c)
(A
(d)
C(A
+ B)
=
CA
+ CB
provided A, B, and C are such that the expressions on the left are defined; here, k is any
scalar. (2b) is called the associative law. l2c) and (2d) are called the distributive laws.
Since matrix mUltiplication is a multiplication of rows into columns. we can write the
defining formula (1) more compactly as
j = 1. .... 111:
(3)
k
=
1. .... p.
where aJ is the jth row vector of A and b k is the hh column vector of B, so that in
agreement with (1),
ll
b:
a Jn ]
']
=
aj1b1k
+
aj2 b2k
+ ... +
aj"bnk .
[
bnk
E X AMP L E 5
Product in Terms of Rowand Column Vectors
If A = [ajkl is of si/e 3
x 3 and B
=
[bjkl is of size 3 x 4. then
alb l
AB =
(4)
a2bl
[
a3 b l
Taking al
=
[3
5
-11. a2 = [4 0 2]. etc .. verify (4) for the product in Example
I.
•
Parallel processing of products on the computer is facilitated by a variant of (3) for
computing C = AB, which is used by standard algorithms (such as in Lapack). In this
method, A is used as given. B is taken in terms of its column vectors, and the product is
compUled columnwise: thus,
(5)
SEC 7.2
281
Matrix Multiplication
Columns of B are then assigned to different processors (individually or several IO each
processor), which simultaneously compute the columns of the product matrix Ab l , Ab2, etc.
E X AMP L E 6
Computing Products Columnwise by (5)
To obtain
o
4
AB = [
-5
4
7J - [11
-17
6
4
8
34J
-23
from (5), calculate the columns
of AB and theu wnte them as a single matrix, as shown in the first formula ou the right.
Motivation of Multiplication
•
by Linear Transformations
LeL us now motivate the "unnatural" matrix multiplication by its use in linear
transformations. For II = 2 variables these transformations are of the form
(6*)
and suffice to explain the idea. (For general n they will be discussed in Sec. 7.9.) For
instance, (6*) may relate an xlx2-coordinate system to a YIY2-coordinate system in the
plane. In vectorial form we can write (6*) as
(6)
y-
[
YI] -Ax.\'2
[au
1121
Now suppose further that the xlx2-system is related to a wlw2-system by another linear
transformation, say,
(7)
Then the )'IY2-system is related to the ~1'lw2-system indirectly via the x1x2-system, and we
wish to express this relation directly. Substitution will show that this direct relation is a
linear transformation, too, say,
(8)
e 11
y = Cw =
[
C21
Indeed, substituting (7) into (6), we obtain
Y1
=
all(b 11 H."1
=
Y2
+
b 12 W 2)
(allhll
= a21(b ll w l +
+
b 12 W 2)
= (a2I b 11
+ a12(b21 W I +
a12 h 21)wI
+
(a11hI2
+ a22(b21 W I +
+ a22 b21)WI +
b 22 W 2)
+
a12b22)W2
b22W2)
(a21 b I2
+
a22 b 22)W2'
282
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Comparing this with (8), we see that
C12
=
an b 12
+
a12 b 22
C2 2
=
a21 b 12
+
a22 b 22'
This proves that C = AB with the product defined as in (I). For larger matrix sizes the
idea and result are exactly the same. Only the number of variables changes. We then have
III variables y and n variables x and p variables w. The matrices A, B, and C = AB then
have sizes III X Il, 11 X p. and m X p. respectively. And the requirement that C be the
product AB leads to formula (1) in its general form. This motivates matrix multiplication
completely.
Transposition
Transposition provides a transition from row vectors to column vectors and conversely.
More generally, it gives us a choice to work either with a matrix or with its transpose.
whatever will be more practical in a specific situation.
DEFINITION
Transposition of Matrices and Vectors
The transpose of an III X /I matrix A = [ajk] is the 11 X m matrix AT (read A transpose)
that has the rust row of A as its first column, the second row of A as its second
column. and so on. Thus the transpose of A in (2) is AT = [a/<J], written out
(9)
As a special case, transposition converts row vectors to column vectors and
conversely.
E X AMP L E 7
Transposition of Matrices and Vectors
If
5
-8
A= [
4
0
then
A little more compactly, we can write
J+: :l
-8
0
[:
[6
2
3]T
~
or
[38 -1
[:l [:r"
=
2
[30 -18J
3].
Note that for a square matrix. the transpose is obtained by interchanging entries that are symmetrically positioned
•
with respect to the main diagonal, e.g., a12 and a21. and so on.
SEC. 7.2
283
Matrix Multiplication
Rules for transposition are
(AT)T
(a)
(A
(b)
=
A
+ B)T =
AT
+ BT
(10)
(c)
(CA)T
=
CAT
(d)
(AB)T
=
BTAT.
CAUTION! Note that in (lOd) the transposed matrices are ill reversed order. We leave
the proofs to the student. (See Prob. 22.)
Special Matrices
Certain kinds of matrices will occur quite frequently in our work, and we now list the
most important ones of them.
Symmetric and Skew-Symmetric Matrices. Transposition gives rise to two useful
classes of matrices, as follows. Symmetric matrices and skew-symmetric matrices are
square matrices whose transpose equals the matrix itself or minus the matrix, respectively:
(thus akj = -ajk, hence ajj = 0).
(11)
Svmllletric Matri ...
E X AMP L E 8
Ske\\ oS) mmetric M.ltrix
Symmetric and Skew-Symmetric Matrices
120
200]
10
150
150
30
is symmetric. and
is skew-symmetric.
For instance, if a company has three building supply centers C1 , C2 , C3 , then A could show costs, say, ajj for
k) the cost of shipping 1000 bags from Cj to C k .
handling 1000 bags of cement on ceoter Cj , and ajl, (j
Clearly. ajk = lI~j because shipping in the opposite direction will usually cost the same.
Symmetric matrices have several general pmperties which make them importaot. This will be seen as we
proceed.
•
"*
Triangular Matrices. Upper triangular matrices are square matrices that can have
nonzero entries only on and above the main diagonal, whereas any entry below the diagonal
must be zero. Similarly, lower triangular matrices can have nonzero entries only on and
below the main diagonaL Any entry on the main diagonal of a triangular matrix may be
zero or not.
EXAMPLE 9
Upper and Lower Triangular Matrices
[~
:J.
4
3
[:
0
LIpper triangular
:l
()
-I
[:
6
:l [~ ~l
0
0
-3
0
0
2
9
3
Lo\\ er Irian!!ul"r
•
284
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Diagonal Matrices. These are square matrices that can have nonzero entries only on
the main diagonal. Any entry above or below the main diagonal must be zero.
If all the diagonal entries of a diagonal matrix S are equal. say, c, we call S a scalar
matrix because mUltiplication of any square matrix A of the same size by S has the same
effect as the multiplication by a scalar, that is,
AS
(12)
= SA = cA.
In particular, a scalar matrix whose entries on the main diagonal are all 1 is called a
unit matrix (or identity matrix) and is denoted by In or simply by I. For I, formula (12)
becomes
AI = IA = A.
(13)
E X AMP L E 1 0
Diagonal Matrix D. Scalar Matrix S. Unit Matrix I
•
Applications of Matrix Multiplication
Matrix multiplication will play a crucial role in connection with linear systems of
equations, beginning in the next section. For the time being we mention some other simple
applications that need no lengthy explanations.
E X AMP L E 11
Computer Production. Matrix Times Matrix
Supercomp Ltd produces two computer models PC I086 and PC 1186. The matrix A shows the cost per computer
(in thollsands of dollars) and B the production figures for the year 2005 (in multiples of 10000 units.) Find a
mutrix C that shows the shareholders the cost per quarter (in millions of dollars) for raw muterial. labor. and
miscellaneous.
Quarter
PCIOH6
A =
PCIIH6
1.2
1.6]
Raw Components
0.3
0.4
Labor
0.6
Miscellaneous
[0.5
2
3
8
6
PCI086
2
4
PCI186
4
Solutioll.
Quarter
2
C =AB =
3
[132
12.8
13.6
3.3
3.2
3.4
5.1
5.2
5.-l-
4
156]
Raw Components
3.9
Labor
6.3
Miscellaneous
Since cost b given io multiples of $1000 aod production in multiples of 10 000 units the eotries of Care
multiples of $10 millioos; thus ell = 13.2 means $132 miUion. etc.
•
SEC. 7.2
285
Matrix Multiplication
E X AMP L E 12
Weight Watching. Matrix Times Vector
Suppose that in a weight-watching program. a person of 1851b burns 350 callhr in walking (3 mph). 500 in
bycycling (13 mph) and 950 in jogging (5.5 mph). Bill. v.eighing 185 lb. plans to exercise according to lhe
matrix shown. Verify the calculmions (W = Walking. B = Bicycling. J = Jogging).
W
B
MON
0
LO 1.0
WED [ l.0
0
J
::] [=] ~ [I::]
FRI
1.5
SAT
2.0 1.5 1.0
U.5
MON
WED
1000
FRl
2400
SAT
•
950
EXAMPLE 13
Markov Process. Powers of a Matrix. Stochastic Matrix
Suppose that the 2004 state of land use in a city of 60 mi 2 of built-up area i~
C: Commercially Used 25<lc
I: Industrially Used 20%
R: Residentially Used 55%.
Find the stales in 2009, 2014. and 2019, assuming that the transition probabilitie~ for 5-year intervals are given
by the matrix A and remain practically the same over the time considered.
From C
A =
From [
FTomR
0.[
[0'
0.2
0.9
0.[
0
ToC
0°, ]
To I
0.8
ToR
A is a stochastic matrix, that is, a square matrix with all entries nonnegative and all column sums equal to I.
Our example concerns a Markov process1 , that is. a process for which thc probability of entering a certain state
depends only 00 the last state occupied (and the matrix A), not on any earlier state.
Solutioll.
From the matrix A and the 2004 state we can compute the 2009 state.
C
+ 0.[ ·20 + 0,55]
0.2 . 25 + 0.9' 20 + 0.2' 55
0.7'25
[
R
0.1,25
+
0·20
=
+ 0.8,55
[0.7
U.I
0.2
0.9
0.1
o
0.20]
0.8
[25]
20
55
[19.5]
34.0.
46.5
To explain: The 2009 figure for C equals 25o/c times the probability 0.7 that C goes into C, plus 20'7<
probability 0.1 that I goes into C, plus 55% times the probability U that R goes into C. Together,
25· 0.7
+ 20· 0.1 + 55' 0
=
19.5 ['k].
Also
time~
the
25' 0.2 + 20' 0.9 + 55· 0.2 = 34 [%].
Similarly. the new R is 46.5%. We see that the 200!) state vector is the column vector
y
=
[19.5
34.0
46.5]T
= Ax = A [25
20
55]T
where the column ~ector X = [25 20 55] T is the given 2004 state vector. Note that the ~um of the entries of
y is 100 ['7<]. Similarly. you may verify that for 2014 and 20[9 we get the state vectors
z = Ay = A(Ax) = A 2 x = [17.05
u
= Az
=
A~'
= A3x = [16.315
43.80
50.660
39.15]T
33.025]T.
lANDREI ANDREJEVITCH MARKOV (1856-1922), Russian mathematician, known for his work in
probability theory.
CHAP. 7
286
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
2
In 2009 the commercial area will be 19.5% (11.7 mi 2 ). the industrial 34% (20.4 mi ) and the
residential 46.5% (27.9 mi 2 ). For 2014 the conesponding figures are 17.05%.43.80%. 39.15o/r. For 2019 they
are 16.315%. 50.660%. 33.0:!5o/c. (In Sec. 8.2 we shall see what happens in the limit. assuming that those
probabilities remain the same. In the meantime. can you experiment or guess?)
•
Answer.
-
__ ..............
___ ...,
•...
-.•.-- .....,J __....• ::-~
11-141
Let
A
=
MULTIPLICATION, ADDITION, AND
TRANSPOSITION OF MATRICES AND
VECTORS
l L~ =: -~J. l: : -~J'
B
-10
C
l:
5
=
1
-4
~ -~l· ~ [l ~
b
0
[3
11
0
81·
Calculate the following products and sums or give reasons
why they are not defined. (Show all intermediate results.)
1. Aa. Ab, Ab T, AB
2. Ab T + Bb T• (A + B)bT. bA. B - BT
3. AB, BA, AAT, ATA
4. A2. B2, (AT)2, (A2)T
5. aT A, bA, 5B(3a + 2b T). 15Ba + lOBb T
6. ATb, bTB, (3A - 2B)Ta, a T(3A - 28)
7. ab, ba, (ab)A, a(bA)
8. ab - ba, -(4b)(7a), -28ba, 5abB
9. (A + B)2, A2 + AB + BA + B2, A2 + 2AB + B2
10. (A + Bj(A - B). A2 - AB + BA - B2, A2 - B2
11. A2B, A3. (AB)2. A2B2
12. B 3 , BC, (BC)2. (BC)(BC)T
13. aTAa, aT(A + AT)a, bBb T. b(B - BT)b r
14. aTCCTa, a TC 2a. bCTCb T. bCCTb T
15. (General rules) Prove (2) for 2 X 2 matrices A = [ajk].
B = [bjk ]. C = [Cjk] and a general scalar.
16. (Corrunutativity) Find all 2 x 2 matrices A = [ajk]
that commute with B = [bjk ]. where bjk = j + k.
17. (Product) Write AB in Probs. 1-14 in terms of row
and column vectors.
18. (Product) Calculate AB in Prob. 1 column wise. (See
Example 6.)
19. TEAM PROJECT. Symmetric and SkewSymmetric J\;latrices. These matrices occur quite
frequently in applications. so it is worthwhile to study
some of their most important properties.
(a) Verify the claims in (11) that (/kj = ajk for a
symmetric matrix. and akj = -ajk for a skew-symmetric
matrix. Give examples.
(b) Show that for every square matrix C the matrix
C + C T is symmetric and C - C T is skew-symmetric.
Write C in the form C = S + T, where S is symmetric
and T is skew-symmetric and find Sand T in terms of
C. Represent A and B in Probs. 1-14 in this form.
(c) A linear combination of matrices A, B, C, ... ,
M of the same size is an expression of the form
(14)
aA
+
bB
+
cC
+ ... + 111M.
where a . ... , III are any scalars. Shuw that if these
matrices are square and symmetric, so is (14):
similarly, if they are skew-symmetric. so is (14).
(d) Shuw that AB with symmetric A and B is
symmetric if and only if A and B commute, that is.
AB = BA.
(e) Under what condition is the product of skewsymmetric matrices skew-symmetric?
20. (Idempotent and nilpotent matrices) By definition,
A is idempotent if A2 = A, and B is nilpotent if
Bm = 0 for some positive integer 111. Give examples
(different from 0 or I). Also give examples such that
A2 = I (the unit matrix).
21. (Triangular matrices) Let VI' V 2 be upper triangular
and L 1. L2 lower triangular. Which of the following
are triangular? Give examples. How can you save half
of your work by transposition?
U1 + U2, V\V2 , V12. VI + L 1. U1L 1, L1
+
L 2.
L 1L 2, L12
22. (Transposition of products) Prove (lOa)-(lOc).
lllustrate the basic formula (lOd) by examples of your
own. Then prove it.
APPLICATIONS
23. (Markov process) If the transition matrix A has the
entries all = 0.5, a12 = 0.3, a21 = 0.5, (/22 = 0.7 and
the initial state is [1 1] T, what will the next three
states be?
24. (Concert subscription) In a community of 300000
adults, subscribers to a concert se1ies tend to renew their
SUbSCliption with probability 90% and persons presently
not SUbsClibing will subscribe for the next season with
probability 0.1 %. If the present number of subscribers
is 2000, can one predict an increase, denease, or no
change over each of the next three seasons?
SEC. 7.3
Linear Systems of Equations. Gauss Elimination
25. CAS Experiment. Markov Process. Write a program
for a Markov process. Use it to calculate further steps in
Example 13 of the text. Experiment with other stochastic
3 X 3 matlices, also using different starting values.
26. (Production) In a production process, let N mean "no
trouble" and r'trouble." Let the transition probabilities
from one day to the next be 0.9 for N --> N, hence 0.1
for N --> T, and 0.5 for T --> N, hence 0.5 for T --> T.
If today there is no trouble, what is the probability of
N two days after today? Three days after today?
27. (Profit vector) Two factory outlets Fl and F2 in New
York and Los Angeles sell sofas (S), chairs (C). and
tables (T) with a profit of $110, $45, and $80,
respectively. Let the sales in a certain week be given by
the matrix
T
S
c
A
=
400
[600
300
100J
205
820
Introduce a "profit vector" p such that the components
of v = Ap give the total profits of Fl and F 2 .
28. TEAM PROJECT. Special Linear Transformations.
Rotations have various applications. We show in this
project how they can be handled by matrices.
(a) Rotation in the plane. Show that the linear
transformation y = Ax with matrix
A = [COS 8
sin 8
-sin
8]
and
(c) Addition formulas for cosine and sine. By
geometry we should have
[
c~s a
sm a
-sin
a]
cos a
COS
f3
[ sin
f3
=
[cos
(a
sin
(a
-sin f3]
cos
f3
m
+
+ (3)
-sin (a
+
m] .
cos (a + (3)
Del;ve from this the addition formulas (6) in App. A3.1.
(d) Computer graphics. To visualize a threedimensional object with plane faces (e.g., a cube), we
may store the position vectors of the vertices with
respect to a suitable XIX2x3-coordinate system (and a
list of the connecting edges) and then obtain a twodimensional image on a video screen by projecting
the object onto a coordinate plane, for instance, onto
the xlx2-plane by setting -'"3 = O. To change the
appearance of the image. we can impose a linear
transformation on the position vectors stored. Show
that a diagonal matrix D with main diagonal entries
3, 1, ~ gives from an x = [Xj] the new position vector
y = Dx, where Yl = 3Xl (stretch in the Xl-direction
by a factor 3), Y2 = X2 (unchanged), }"3 = ~X3
(contraction in the x3-direction). What effect would a
scalar matrix have?
(e) Rotations in space. Explain y = Ax geometrically
when A is one of the three matrices
cos 8
y=
An =
[
0
[J
is a counterclockwi~e rotation of the Cartesian XIX2coordinate system in the plane about the origin. where
8 is the angle of rotation.
(b) Rotation through nO. Show that in (a)
COS 11f1
- sin I1f1J
sin 1If1
cos 118
Is this plausible? Explain this in words.
7.3
287
cos fI
l:
0
lOO' ·
Si~
cp
sin 8
cos fI
-,m.] l'~·
o
0
-'~n' 1
cos cp .
.
-sin '"
sin '"
cos '"
0
0
]
What effect would these transformations have in
situations such as that described in (d)?
Linear Systems of Equations.
Gauss Elimination
The most important use of matrices occurs in the solution of systems of linear equations,
briefly called linear systems. Such systems model various problems, for instance, in
frameworks, electrical networks, traffic flow, economics, statistics, and many others. In
this section we show an important solution method, the Gauss elimination. General
properties of solutions will be discllssed in the next sections.
288
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Linear System, Coefficient Matrix, Augmented Matrix
A linear system of m equations in 11 unknowns"\ b
. . . 'Xn
is a set of equations of the form
(1)
The system is called linear because each variable Xj appears in the first power only, just
as in the equation of a straight line. alb"', a mn are given numbers, called the
coefficients of the system. b I , . . . , bm on the right are also given numbers. [f all the bj
are zero, then (1) is called a homogeneous system. If at least one bj is not zero, then (1)
is called a nonhomogeneous system.
A solution of (1) is a set of numbers Xl' • . • • Xn that satisfies all the m equations.
A solution vector of (1) is a vector x whose components form a solution of (1). If the
system (1) is homogeneous. it has at least the trivial solution Xl = 0, .... Xn = O.
Matrix Form of the Linear System (1). From the definition of matrix multiplication
we see that the m equations of (1) may be written as a single vector equation
(2)
Ax
=b
where the coefficient matrix A = [ajk] is the
au
([12
a1n
(/21
a22
a2n
In
and
A=
amI
a m2
x n matrix
x=
and
b=
Q·'tnn
Xn
are column vectors. We assume that the coefficients (/jk are not all zero, so that A is not
a zero matrix. Note that x has 11 components, whereas b has III components. The matrix
A=
is called the augmented matrix of the system (1). The dashed vertical line could be
omitted (as we shall do later); it is merely a reminder that the last column of A does not
belong to A.
The uugmellted mutrix A determines the system (1) completely becam,e it contains all
the given numbers appearing in (1).
SEC. 7.3
289
Linear Systems of Equations. Gauss Elimination
E X AMP L E 1
Geometric Interpretation. Existence and Uniqueness of Solutions
If m
Unique solution
= 11 = 2. we have two equations in two unknowns Xl, X2
If we interpret Xl, X2 as coordinates in the xlx2-plane. then each of the two equations represems a slraight line.
and (Xl. -'"2) is a solution if and only if the point P with coordinates Xl' X2 lies on both lines. Hence there are
three possible cases:
(aJ Precisely one solution if the lines intersect.
(b) Infinitely many solutions if the lines coincide.
(c) No solution if the lines are parallel
For instance,
Xl +X2
2xl-x2
=1
=0
Xl +X2
2xl +
=
1
2x2 =
Case (b)
Case (a)
=
1
xl +X2 =
0
Xl +X2
2
Case (c)
x2
/
Infinitely
many solutions
~
.p
/
I
/
:I
xl
If the system is homogenous, Case (c) cannot happen. because then those two straight lines pass through the
origin. whose coordinates O. 0 constitute the trivial solution. If you wish, consider three equations in three
unknowns as representations of three planes in space and discuss the various possible cases in a similar fashion.
See Fig. 156.
•
Our simple example illustrates that a system (I) may perhaps have no solution. This poses
the following problem. Does a given system (1) have a solution? Under what conditions
does it have precisely one solution? If it has more than one solution, how can we
characterize the set of all solutions? How can we actually obtain the solutions? Perhaps
the last question is the most immediate one from a practical viewpoint. We shall answer
it first and discuss the other questions in Sec. 7.5.
Gauss Elimination and Back Substitution
No solution
Fig. 156. Three
equations in
three unknowns
interpreted as
planes in space
This is a standard elimination method for solving linear systems that proceeds
systematically irrespective of particular features of the coefficients. It is a method of great
practical importance and is reasonable with respect to computing time and storage demand
(two aspects we shall consider in Sec. 20.1 in the chapter on numeric linear algebra). We
begin by motivating the method. If a system is in "triangular form," say,
2Xl
+ 5X2 =
13x2
2
= -26
we can solve it by "back substitution," that is, solve the last equation for the variable.
= -26113 = -2, and then work backward, substituting X2 = -2 into the fIrst equation
X2
290
CHAP. 7
Linear Algebra: Matrices. Vectors. Determinants. Linear Systems
and solve it for Xl' obtaining Xl = ~(2 - 5x2 ) = ~(2 - 5· (-2» = 6. This gives us the idea
of fIrst reducing a general system to triangular form. For instance, let the given system be
-3~J .
5
-4Xl
+ 3x2 = - 30.
Its augmented matrix is
3
We leave the fust equation as it is. We eliminate Xl from the second equation. to get a triangular
system. For this we add twice the fIrst equation to the second, and we do the same operation
on the rows of the augmented matrix. This gives -4Xl + 4Xl + 3X2 + 10x2 = -30 + 2· 2,
that is,
2
-26
Row :2
+
2 Row I
[~
5
13
where Row :2 + :2 Row I means "Add twice Row 1 to Row T in the original matrix.
This is the Gauss elimination (for 2 equations in 2 unknowns) giving the triangular form,
from which back substitution now yields X2 = - 2 and Xl = 6, as before.
Since a linear system is completely determined by its augmented matrix, Gauss
elimination call be dOlle by merely considering the matrices, as we have just indicated.
We do this again in the next example. emphasizing the matrices by writing them first and
the equations behind them. just as a help in order not to lose track.
E X AMP L E 2
Gauss Elimination. Electrical Network
Solve the linear system
This is the ~ystem for the unknown currellIs
in the electrical network in Fig. 157. To obtain it. we label the currents as shown.
choosing directions arbitrarily: if a current will come out negative. this will simply mean that the current flows
against the direction of our arrow. The current entering each battery will be the same as the current leaving it.
The equations for the CUlTents result from Kirchhoff's laws:
Derivation from the circuit ill Fig. 157 (Optional).
Xl
=
i l . x2
=
i 2 • x3
=
i3
Kirchhoff's currellt law (KCL). At allY poim of a circuit. rhe sum of the illf/owillg Cl/rrems equals the
of fhe olltf/oll"illg ("lIrrellTs.
Sll111
Kirclzhoff's I'oltage law (KVL). 111 allY closed loop. the slim of all I'Olrage drops eqllals rhe impressed
electromotil'e force.
Node P gives the first equation, node Q the second, the right loop the third. and the left loop the fourth, as
indicated in the figure.
80
v~
rtJ
P
Fig. 157.
15Q
fWV
NodeP:
i1 -
i2
+
i3 =
NodeQ:
-ir +
i2
-
13
Right loop:
Left loop:
0
= 0
1Oi2 + 25i 3 = 90
20;1 + lOi2
Network in Example 2 and equations relating the currents
=80
SEC. 7.3
291
Linear Systems of Equations. Gauss Elimination
Solution by Gauss Elimination.
This system could be solved rather quickly by noticing its particular
form. But this is not the point. The point is that the Gauss elimination is systematic and will work in general,
also for large systems. We apply it to our system and then do back substitution. As indicated let us write the
augmented matrix of the system first and then the system itself:
'-'fI;I
Augmented Matrix
CD-I
Flimi",'e~
Pi""
A
Equations
Pivotl~~-
X2
-I
10
25
10
0
I
I
I
I
I
9:]
Elimlllate ~
80
Cl
20xl
+
X3
= 0
x3
= 0
+ 25x3
= 90
'\2 lOx2
+ IOx2
= 80
Step 1. Elimination of Xl
Call the first row of A the pivot row and the first equation the pivot equation. Call the coefficient I of its
xrterm the pivot in this step. Use this equation to eliminate Xl (get rid ot xl) in the other equations. For this, do:
Add I times the pivot equation to the second equation.
Add -20 times the pivot equation to the fourth equation.
This corresponds to row operations on the augmented matrix as indicated in BLUI behind the new matrix in
(3). So the operations are performed on the preceding matrix. The result is
-I
(3)
[;
X2
Xl -
0
0
to
25
30
-20
+
X3 =
Row 2.L Row I
0= 0
i]
80
0
IOx2
Row 4 - 20 Row I
+ 25x3
30x2 - 20x3
= 90
=
80.
Step 2. Elimination of X2
The first equation remains as it is. We want the new second equation to serve as the next pivot equation. But
since it has no x2-term (in fact, it is 0 = 0), we mllst first change the order of the equations and the corresponding
rows of the new mauix. We put 0 = 0 at the end and move the third equation and rhe fourth equation one place
up. This is called partial pivoting (as opposed to the rarely used total pivoting, in which also the order of the
unknowns is changed). It gives
l
-I
Pivotlt~
0
@
25
Eliminate 3,,~
:
~
-20
r
0
Xl -
;]
Pivot It
X3 =
0
25x3 = 90
Eliminate 30-'2 ~ 130x21- 2o.r3 = 80
80
0
+
,@;)+
X2
0
0 =
0
To eliminate X2' do:
Add -3 times the pivot equation to the third equation.
The result is
(4)
r~
-I
to
0
0
I' 0]
25:
-95
i-
01
90
190
X2
Xl -
I
IOx2
Row 3 - 3 Row 2
+
X3 =
+ 25x3
=
0
90
- 9SX3 = -190
0
0=
0
Back Substitution. Determination ofx3' x2' Xl (in this order)
Working backward from the last to the first equation of this "triangular" system (4), we can now readily find
x3, then .\'2, and then xl:
-95x3 = -190
90
X3 =;3
=
2lAJ
X2 = fo(90 - 25x3) = i2 = 4 [AJ
o
where A stands for "amperes." This is the answer to our problem. The solution is unique.
•
292
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Elementary Row Operations. Row-Equivalent Systems
Example 2 illustrates the operations of the Gauss elimination. These are the first two of
three operations. which are called
Elementary Row Operations for Matrices:
Interchange
(~f two
rows
Addition of a constant multiple of one row to another row
Multiplication of a row by a nonzero constant c.
CAUTION!
following
These operations are for rows, not for columns! They correspond to the
Elementary Operations for Equations:
Interchange of two equations
Addition of a constant multiple of one equation to another equation
Multiplication of an equation by a nonzero constant c.
Clearly, the interchange of two equations does not alter the solution set. Neither does that
addition because we can undo it by a corresponding subtraction. Similarly for that
multiplication, which we can undo by multiplying the new equation by lIc (since c =1= 0),
producing the original equation.
We now call a linear system SI row-equivalent to a linear system S2 if SI can be
obtained from S2 by (finitely many!) row operations. Thus we have proved the following
result, which also justifies the Gauss elimination.
THEOREM 1
Row-Equivalent Systems
Row-equivalent linear systems have the same set of solutions.
Because of this theorem, systems having the same solution sets are often called
equivalent systems. But note well that we are dealing with row operations. No column
operations on the augmented matrix are pennitted in this context because they would
generally alter the solution set.
A linear system (1) is called overdetermined if it has more equations than unknowns.
as in Example 2. determined if m = n. as in Example I. and underdetermined if it has
fewer equations than unknowns.
Furthermore, a system (1) is called consistent if it has at least one solution (thUS, one
solution or infinitely many solutions), but inconsistent if it has no solutions at all, as
Xl + X2 = I, Xl + X2 = 0 in Example l.
Gauss Elimination: The Three Possible Cases of Systems
The Gauss elimination can take care of linear systems with a unique solution (see Example
2), with infinitely many solutions (Example 3, below), and without solutions (inconsistent
systems; see Example 4).
SEC 7.3
293
Linear Systems of Equations. Gauss Elimination
E X AMP L E 3
Gauss Elimination if Infinitely Many Solutions Exist
Solve the following linear systems of three equatIons in four unknowns whose augmented matrix is
(S)
[3U
2.0
2.0
-S.O
0.6
I.S
I.S
-S.4
1.2
-0.3
-0.3
2.4
I
I
I
I
I
I
'U]
2.7
~ + 2.0X2 + 2·(l~3 - S.OX4
.
Thus.
2.1
=
10.6X11: I.SX2: I.Sx3 - S.4x4 :
1.2Tl
0.3'\2
0.3X3
+
2.4x4 -
8.0
2.7
2.1.
Solutioll.
As in the previous example. we circle pivots and box terms of equations and corresponding entries
to be eliminated. We indicate the operations in terms of equations and operate on both equations and matrices.
Step 1. Elimillation O/Xl from the second and third equations by adding
- 0.6/3.0 = -0.2 times the first equation to the second equation,
- 1.2/3.0
~
-0.4 times the first equation to the third equation.
This gives the following, in which the pivot of the next step is circled.
2.0
l.l
1.1
-4.4
-1.1
-1.1
4.4
(6)
Step 2. Elimillatioll
3.0Xl + 2.0X2 + 2.Ox3 - S.Ox4 =
!l.0
Row 2 - 0.2 Row I
~+ 1.1x3 - 4.4x4 =
l.l
Row 3 - OA Row I
1-l.1x21- 1.1x3 + 4.4x4 = -
l.l
-S.O
2.0
0/ x2 from
8.0]
1.1
-1.1
the third equation of (6) by adding
1.111.1 = I times the second equation to the third equation.
This gives
(7)
2.0
2.0
1.1
1.1
o
-S.O
i
-4.4 I
8.0
8.0]
l.l
1.1
I
o
010
RO\\ 3 ;- RO\\ 2
0=
o
Back Substitution. From the second equation. X2 = 1 - X3 + 4x4' From this and the fIrst equation.
Xl = 2 - X4' Since x3 and x4 remain arbitrary. we have infinitely many solutions. If we choose a value of
x3 and a value of X4. then the corresponding values of Xl and x2 are uniquely determined.
If unknowns remain arbitrary. it is al~o customary to denote them by other letters 11, 12•....
In this example we may thus write Xl = 2 - X4 = 2 - 12. x2 = I - x3 + 4X4 = I - '1 + 412. x3 = 11 (flrst
arbitrary unknown), X4 = 12 (second arbitrary unknown).
•
Oil Notation.
E X AMP L E 4
Gauss Elimination if no Solution Exists
What will happen if we apply the Gauss elimination to a linear system that has no solution? The answer is that
in this case the method will show this fact by producing a contradiction. For instance. consider
3 2 1: 3]
[
2
1
: 0
4
I 6
I
6
2
@+ 2~2 +
X3
~+
X2
+
X3 = 0
~+
2X2
+
4x3 = 6.
=
3
Step 1. Eliminatioll o/x] from the second and third equations by adding
-~ time, the fIrst equation to the second equation.
-i =
-2 times the first equation to the third equation.
This give,
2
1
[:
-3
-2
3xl
+
2~2
+
x3 =
3
: 3J
:-2
Row ]. - ~ Ron 1
(B+
- 3x 2
2
I
I
RO\I J - 2 Row I
1- U2/+ U3 =
1
0
3
-2
31 x 3-
O.
294
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Step 2. Elimillatioll of X2 from the third equation gives
2
1! ::-23]
o
o
The false statement 0
~
I
I
12
Rm" 3
6 Ro\\ 2
12 shows that the system has no
0= 12.
•
~olution.
Row Echelon Form and Information From It
At the end of the Gauss elimination the form of the coefficient matrix, the augmented
matrix, and the system itself are called the row echelon form. In it. rows of zeros. if
present. are the la"t rows. and in each nonzero row the leftmost nonzero entry is farther
to the right than in the previous row. For instance. in Example 4 the coefficient matrix
and its augmented in row echelon fonn are
2
1
[:
-3
0
2
:]
and
1
[:
1
-3
3"
0
0
-+
12
Note that we do not require that the leftmost nonzero entries be I since this would have
no theoretic or numeric advantage. (The so-called reduced echelon form, in which those
entries are I, will be discussed in Sec. 7.8.)
At the end of the Gauss elimination (before the back substitution) the row echelon form
of the augmented matrix will be
(8)
Here, r ~ 111 and (/11 =1= 0, C22 =1= 0, ... , k1T =1= 0, and all the entries in the blue triangle
as well as in the blue rectangle are zero. From this we see that with respect to solutions
of the system with augmented matrix (8) (and thus with respect to the originally given
system) there are three possible cases:
(a) Exactly one solution if r = 1l and b,,+ .. .... bm' if present. are zero. To get the
solution. solve the nth equation corresponding to (8) (which is knnxn = bn) for X n ' then
the (n - l)st equation for Xn-l, and so on up the line. See Example 2, where r = n = :3
and I7l = 4.
(b) Infinitely many solutions if r < 11 and b,.+!, .... bm' if present, are zero. To obtain
any of these solutions, choose values of X r + l , ••• 'Xl1 arbitrarily. Then solve the 7th equation
for x,., then the (1' - l)st equation for X,._!, and so on up the line. See Example 3.
(c) No solution if r < 111 and one of the entries br + I'
4, where r = 2 < m = 3 and br + 1 = b3 = 12.
•.. ,
bm
is not zero. See Example
SEC. 7.3
11-161
GAUSS ELIMINATION AND BACK
SUBSTITUTION
20.9
2. 3.0x
+
6.2y = 0.2
-x + 4y = -11).3
2.lx
+
8.5y = 4.3
1. 5x - 2y
+
3. 0.5x
=
3.5)'
=
5.7
-x + 5.0.1'
=
7.8
4.
4x
5. 0.8x + 1.2.1' - 0.6:::
=
-7.8
+ 1.7z
=
15.3
4.0x - 7.3y - 1.5::: =
l.l
2.6x
6. 14x
-
2y
-
47
= 0
18x
-
2}'
-
6;:
= 0
+
8)'
-
J4z
= 0
4x
8. 2x + y
y + 3z
=
-1
-4x + 2y - 6z =
2
- 2y - 2:
4y - 5z
=
-4w -2w
14.
+
IV -
-
31-1'
+
7
81<'
+
4y - 2:::
=
0
+
=
13
+
29
8y - 4::: = 24
-
6x
2::
17x + 4y + 3: =
7w
2x
0
+
3y
+
8y - 6: = -20
511' - 13x -
y
-
+
2- =
5.: =
0
16
-=
x + y +
<.
=
4y
+
4:::
=
24
27
=
-6
=
18
11)'
-
17y
+
-2
= -12
z
2
f3.
A:2A I
~
1/2 ViC I
'L~
_Eo
x 3x
+
19·~1!;OQ
5Q
~tJ--~3_5_V___- ,
13
=
y
-1.3
8
-
x + y - 2:
13.
=
0.3)' - 0.4: = -1.9
=
+
=
6;:
9.
-4.6x + 0.5)' + 1.2z
3x
z
+
-
12.
2
4y
6x
2x -
=
+
8x - y + 7z =0
11.
8y + :
16. -211'
Y
3x
+
+
17.~
+ 2::: = 3
0.6x
511' + 4x
MODELS OF ELECTRICAL NETWORKS
Using Kirchhoff's laws (see Example 2), find the currents.
(Show the details of your work.)
-
5x
7y - 4: = -46
117-191
7.
3: = 8
-
+
+
-lI'
4y - 2;::
6x - 2)' +
3x
15.
Solve the following systems or indicate the nonexistence of
solutions. (Show the details of your work.)
10.
295
Linear Systems of Equations. Gauss Elimination
+
0
=
2:: = -4
3y - 62
=
-2
2x + 5y - 3z = 0
6x
+
v
+ :
2w - 4x + 3y -
z
=
0
=
3
Wheatstone bridge
Net of one-way streets
(Prob. 20, next page)
(Prob. 21, next page)
296
CHAP. 7
Linear Algebra: Matrices, Vectors. Determinants. Linear Systems
20. (Wheatstone bridge) Show that if RxlR3
= Rl/R2 in
the figure. then T = O. (Ro is the resistance of the
instrument by which I is measured.) This bridge is a
method for determining R.r . R I • R 2 • R3 are known. R3
is variable. To get Rx. make I = 0 by varing R 3 . Then
calculate Rx = R 3R I /R z .
to do row operations directly,
mUltiplication by E.)
(a) Show that the following are elementary matrices,
for interchanging Rows 2 and 3. for adding -5 times
the first row to the third, and for mUltiplying the fourth
row by 8.
21. (Traffic flow) Methods of electrical circuit analysis
have applications to other fields. For instance, applying
the analog of Kirchhoff's current law, find the traffic
flow (cars per hour) in the net of one-way streets (in
the directions indicated by the an'ows) shown in the
figure. Is the solution unique?
22. (Model., of markets) Determine the equilibrium
solution (D 1 = SI, D2 = S2) of the two-commodity
market with linear model (D, S, P = demand, supply,
price: index I = first commodity. index 2 = second
commodity)
-5
+ 14
o
Dl
=
60 - 2P I
D2 = 4Pl - P 2
-
P2•
4P l
-
2P 2
+
10.
5P 2
-
2.
23. (Equiyalence relation) By definition, an equivalence
relation on a set is a relation satisfying three conditions
(named as indicated):
(i) Each element A of the set is equivalent to itself
( "R~f7exivity").
(iil If A is equi\'alent to B. then B is equivalent to A
("SYlIlllletn- ").
(iii) If A is equivalent to B and B is equivalent to C,
then A is equivalent to C ("Transitivity").
Show that row equivalence of matrices satisfies these
three conditions. Him. Show that for each of the three
elementary row operations these conditions hold.
rather than by
o
o
o
o
o
o
1
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
o
8
Apply E10 E 2 , E3 to a vector and to a 4 X 3 matrix of
your choice. Find B = E3E2EIA, where A = [ajk] is
the general 4 X 2 matrix. Is B equal to C = EIE2E3A?
(b) Conclude that Eb E 2 , E3 are obtained by doing
the corresponding elementary operations on the 4 X 4
unit matlix. Prove that if lVl is obTained from A hy an
elementary rOil' operation. then
M=EA,
24. PROJECT. Elementary Matrices. The idea is that
elementary operations can be accomplished by matrix
multiplication. If A is an 111 X Il matrix on which we
want to do an elementary operation, then there is a
matrix E such that EA is the new matrix after the
operation. Such an E is called an elementary matrix.
This idea can be helpful, for instance. in the design of
algorithms. (Computationally, it is generally preferable
7.4
where E is obtained from the
the same row operation.
11
X
Il
unit matrix In by
25. CAS PROJECT. Gauss Elimination and Back
Substitution. Write a program for Gauss elimination
and back substitution (a) that does not include pivoting,
(b) that does include pivoting. Apply the programs to
Probs. 13-16 and to some larger systems of your choice.
Linear Independence. Rank of a Matrix.
Vector Space
In the last section we explained the Gauss elimination with back substitution, the most
important numeric solution method for linear systems of equations. It appeared that such
a system may have a unique solution or infinitely many solutions. or it may be inconsistent,
that is, have no solution at alL Hence we are confronted with the questions of existence
and uniqueness of solutions. We shall answer these questions in the next section. As the
SEC. 7.4
297
Linear Independence. Rank of a Matrix. Vector Space
key concept for this (and other questions) we introduce the rallk of a matrix. To define
rank, we first need the following concepts, which are of general importance.
Linear Independence and Dependence of Vectors
Given any set of 111 vectors ~1J' . • • , ~m) (with the same number of components), a linear
combination of these vectors is an expression of the form
where
Cl' C2, ••• ,
em are any scalars. Now consider the equation
(1)
Clearly, this vector equation (I) holds if we choose all c/s zero, because then it becomes
O. [f this is the only m-tuple of scalars for which (1) holds, then our vectors
a(1), ... , a('m) are said to fOlm a linearly independent set or, more briefly, we call them
linearly independent. Otherwise, if (1) also holds with scalars not all zero, we call these
vectors linearly dependent, because then we can express (at least) one of them as a
linear combination of the others. For instance, if (l) holds with, say, Cl =1= 0, we can
solve (I) for a(1):
o=
(Some k/s may be zero. Or even all of them, namely, if a(1) = 0.)
Why is this important? Well, in the case of linear dependence we can get rid of some
of the vectors until we anive at a linearly independent set that is optimal to work with
because it is smallest possible in the sense that it consists only of the "really essential"
vectors, which can no longer be expressed linearly in terms of each other. This motivates
the idea of a "basis" used in various contexts, notably later in our present section.
E X AMP L E 1
Linear Independence and Dependence
The three vectors
3
0
2
2]
[-6
42
24
54J
= [21
-21
o
-15]
3{l)=[
3(2) =
3(3)
are linearly dependent because
Although this is easily checked (do it!), it is not so ea~y to discover. However. a systematic method for finding
out about linear independence and dependence follows below.
The first two of the three vectors are linearly independent because c13m + c23c2) = 0 implies c2 = 0 (from
•
the second components) and then C1 = 0 (from any other component ot 3(U)'
Rank of a Matrix
DEFINITION
The rank of a matrix A is the maximum number of linearly independent row vectors
of A. It is denoted by rank A.
298
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Our further discussion will show that the rank of a matrix is an important key concept for
understanding general properties of matrices and linear systems of equations.
EXAMPLE 2
Rank
The matnx
[
(2)
2
42
24
54
-21
0
-15
J
~ = ~~
']
0
has rank 2. because Example 1 shows that the first two TO'" vectors are linearly independent. whereas all three
row vectors are linearly dependent.
Note further that rank A = 0 if and only if A - O. This follows directly from the definition.
•
We call a matrix Al row-equivalent to a matrix A2 if Al can be obtained from A2 by
(finitely many!) elementary row operations.
Now the maximum number of linearly independent row vectors of a matrix does not
change if we change the order of rows or multiply a row by an nonzero c or take a linear
combination by adding a multiple of a row to another row. This proves that rank is
invariant under elementary row operations:
THEOREM 1
Row-Equivalent Matrices
Row-equivalent matrices hal'e the slime rank.
Hence we can determine the rank of a matrix by reduction to row-echelon form
(Sec. 7.3) and then see the rank directly.
E X AMP L E 3
Determination of Rank
For the matrix in Example 2 we obtain successively
']
']
A+:
42
24
54
21
-21
0
-IS
3
0
2
0
42
28
58
0
-21
14
-29
3
0
2
0
42
28
0
0
0
[
[
0
2
':]
(given)
Row 2
+ 2 Row
I
Row 3 - 7 Row I
Row 3
+! Row 2
Since rank is defined in terms of two vectors, we immediately have the useful
THEOREM 2
Linear Independence and Dependence of Vectors
p vectors with 11 components each are linearly i1ldependent if the matrix with these
vectors as row vectors has rank p, but they are linearly dependent if that rank is
less than p.
•
SEC. 7.4
Linear Independence. Rank of a Matrix. Vector Space
299
Further impOltant properties will result from the basic
Rank in Terms of Column Vectors
THEOREM 3
The rank r of a matrix A equals the maximum number of linearly independent
column vectors of A.
Hence A alld its transpose AT have the same rallk.
PROOF
In this proof we write simply "rows" and "columns" for row and column vectors. Let
A be an 171 X n malIu of rank A = r. Then by definition of rank, A has r Linearly
independent rows which we denote by v(1), ... , V(T) (regardless of their position in A),
and all the rows a(l), • . • , a(m) of A are linear combinations of those, say,
(3)
These are vector equations for rows. To switch to columns, we write (3) in terms of
components as n such systems, with k = ], . . . , n,
alk
=
cnulk
+
Cl2 U 2k
+ ... +
CITUTk
a2k
=
C21 U lIc
+
C22 U 2k
+ ... +
C2T U Tk
({mk = CmlUlk
+
c'm2 u 2k
+ ... +
CmTUTk
(4)
and collect components in columns. Indeed. we can write (4) as
cn
({Ik
({2k
(5)
C12
C2 1
=
C22
+
U 1k
U2k
C",I
({mk
cIT
+ ... +
C2T
UTk
C.n~T
Cm.2
where k = I,· .. , n. Now the vector on the left is the hh column vector of A. We see
that each of these n columns is a linear combination of the same r columns on the right.
Hence A cannot have more Linearly independent columns than rows, whose number is
rank A = r. Now rows of A are columns of the transpose AT. For AT our conclusion is
that AT cannot have more linearly independent columns than rows, so that A cannot have
more linearly independent rows than columns. Together, the number of Linearly
independent columns of A must be r, the rank of A. This completes the proof.
•
E X AMP L E 4
Illustration of Theorem 3
The matrix in (2) has rank 2. From Example 3 we see that the first two row vectors are linearly independent
and by "working backward" we can verify that Row 3 = 6 Row I
Row 2. Similarly, the first two columns
are linearly independem. and by reducing the last matnx in Example 3 by columns we find that
-i
Column 3
=
~ Column I
+
~ Column 2
and
Column 4 = ~ Column I + ~ Column 2.
•
CHAP. 7
300
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Combining Theorems 2 and 3 we obtain
THEOREM 4
Linear Dependence of Vectors
p vectors witll n < p components are always linearly dependent.
PROOF
The matrix A with those p vectors as row vectors has p rows and 11 < P columns; hence by
•
Theorem 3 it has rank A ~ II < p, which implies linear dependence by Theorem 2.
Vector Space
The following related concepts are of general interest in linear algebra. In the present
context they provide a clarification of essential properties of matrices and their role in
connection with linear systems.
A vector space is a (nonempty) set V of vectors such that with any two vectors a and
b in Vall their linear combinations aa + f3b (a, f3 any real numbers) are elements of V,
and these vectors satisfy the laws (3) and (4) in Sec. 7.1 (written in lowercase letters a,
b, u, ... , which is our notation for vectors). (This definition is pre~ently sufficient.
General vector spaces will be discussed in Sec. 7.9.)
The maximum number of linearly independent vectors in V is called the dimension of
Vand is denoted by dim V. Here we assume the dimension to be finite; infinite dimension
will be defined in Sec. 7.9.
A linearly independent set in V consisting of a maximum possible number of vectors
in V is called a basis for V. Thus the number of vectors of a basis for V equals dim V.
The set of all linear combinations of given vectors a(l), . . . , alP) with the same
number of components is called the span of these vectors. Obviously, a span is a vector
space.
By a subspace of a vector space V we mean a nonempty subset of V (including V itself)
that forms itself a vector space with respect to the two algebraic operations (addition and
scalar multiplication) defined for the vectors of V.
E X AMP L E 5
Vector Space, Dimension, Basis
The span of the three vecrors in Example I is a vector space of dimension 2, and a basis is 3(1), 3(2), for instance,
or 3(l). 3(3), etc.
•
We further note the simple
THEOREM 5
Vector Space R"
Tile vector space Rn consisting of all vectors with n cOlllpOnel1lS
has dimension 11.
PROOF
A basis of
3cn) = [0
11
vectors is aU)
0 1].
[1
0
[0
(11
real numbers)
o
0], ... ,
•
In the case of a matrix A we call the span of the row vectors the row space of A and the
span of the column vectors the column space of A.
SEC. 7.4
Linear Independence. Rank of a Matrix. Vector Space
301
Now, Theorem 3 shows that a matrix A has as many linearly independent rows as
columns. By the definition of dimension, their number is the dimension of the row space
or the column space of A. This proves
Row Space and Column Space
THEOREM 6
The row space and the column space ofa matrix A have the same dimension, equal
to rank A.
Finally, for a given matrix A the solution set of the homogeneous system Ax = 0 is a
vector space, called the null space of A, and its dimension is called the nullity of A. In
the next section we motivate and prove the basic relation
rank A
(6)
11-121
+ nullity
A = Number of columns of A.
2
3
4
2
3
4
5
3
4
5
6
4
5
6
7
RANK, ROW SPACE, COLUMN SPACE
Find the rank and a basis for the row space and for the
column space. Hint. Row-reduce the matrix and its
transpose. (You may omit obvious factors from the vectors
of these bases.)
10.
2
4
16
8
842
16
11.
~J
b
4. [:
a
4
8
16
2
2
16
8
4
o
o
o
o
7
5
0
-7
5
0
:2
o
:2
0
12.
o
4
o
2
o
4
0
o
4
8
-2
3
-4
113--20 I
2
-3
4
-1
Are the following sets of vectors linearly independent?
(Show the details.)
2
3
-4
o
4
-[
8.
7.
o
3
0
o
5
8
-37
3
8
7
0
o
-37
o
37
9.
-2
2
-3
13. [3
[2
14. [1
LINEAR INDEPENDENCE
-2 0 4], [5 0 0
0 0 3]
0], [1 0 0]. [1
1], L-6
[
0
1]
15. [6 0 3 1 4 2], [0 -1
[12 3 0 -19 8 -11]
2
7
16. [3 4 7], [2 0 3], [8 2
17. [0.2 1.2 5.3 2.8 1.6],
[4.3 3.4 0.9 2.0 -4.3]
3], [5
0
5
5],
6]
I],
CHAP. 7
302
18. [3
2
19. [ I
I
[!
20. [I
2
~
5
2
I]. [0
0
0]. [4
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
3
25. If the row vectors of a square matrix are linearly
independent. so are the column vectors. and
conversely.
61
! !]. [~ ! ! H [! ! ! !].
! t]
3
4], [2
3
4
5]. [3
4
5
26. Give examples showing that the rank of a product of
matrices cannot exceed the rank of either factor
6],
[4 5 6 7]
21. CAS Experiment. Rank. (a) Show experimentally
that the 11 X II matrix A = [ajk] with lIjk = j + k - I
has rank 2 for any 11. (Problem 20 shows 11 = 4.) Try
to prove it.
(b) Do the same when lIjk = j + k + c. where c is
any positi ve integer.
(c) What is rank A if ajk = 2 j + k - 2 ? Try to find other
large matrices of low rank independent of 11.
122-261
PROPERTIES OF RANK
AND CONSEQUENCES
Show the following.
22. rank BTAT = rank AB. (Note the order!)
23. rank A = rank B does lIot imply rank A2
(Give a counterexample.)
VECTOR SPACES
Is the given set of vectors a vector space? (Give reason.) If
your answer is yes, determine the dimension and find a
basis. (Vb V2, • . . denote components.)
27. All vectors in
such that
VI
2V2 -
28. All vectors in
R4
such that
29. All vectors in
R3
with
30. All vectors in
R2
31. All vecrors in
R3
rank B2.
=
0
3v 4 =
k
VI
~ O. V 2 = -4V3
with
VI
~ V2
with
4VI
32. All vectors in R4 with
VI -
33. All vectors in R with
=
+ V2
R3
n
24. If A is not square, either the row vectors or the column
vectors of A are linearly dependent.
7.5
127-361
+
V3 =
V2 =
IvA ~
O.
0,
3v 2
=
V3
V3 = 5v I • v 4 =
I for j = I, ...
0
,11
34. All ordered quadruples of positive real numbers
35. All vectors in
R
5
with
VI = 2V2 = 3V3 = 4V4 = 5v5
36. All vectors in R4 with
3VI - V3 = O. 2VI + 3v 2
-
4V4 =
0
Solutions of Linear Systems:
Existence, Uniqueness
Rank as just defined gives complete information about existence, uniqueness, and general
structure of the solution set of linear systems as follows.
A linear system of equations in 11 unknowns has a unique solution if the coefficient matrix
and the augmented matrix have the same rank 11, and infinitely many solution ifthat common
rank is less than 11. The system has no solution if those two matrices have different rank.
To state this precisely and prove it. we shall use the (generally important) concept of
a submatrix of A. By this we mean any matrix obtained from A by omitting some rows
or columns (or both). By definition this inclUdes A itself (as the matrix obtained by omitting
no rows or columns); this is practical.
THEOREM 1
Fundamental Theorem for Linear Systems
(a) Existence. A linear SYSTem of m equaTions ill n unknowlls
(1)
Xl' . . . , Xn
SEC. 7.5
Solutions of Linear Systems: Existence. Uniqueness
303
is consistent, that is, has solutions, !f and only
augmented matrix A have the same rallk. Here,
A=
and
(f the coefficient matrix A and the
A=
Q
rnn
(b) Uniqueness. The system (l) has precisely one solution ~f and only !f this
common rank r of A and A equals n.
(c) Infinitely many solutions. {f this commOn rank r is less thann, the system
(l) has infinitely mallY solutions. All of these solutions are obtained by determining
r suitable unlmowns (whose submatrix of coefficients must have rank r) in tenl1S of
the remaining n - r unknowns, to which arbitrary values can be assigned. (See
Example 3 in Sec. 7.3.)
(d) Gauss elimination (Sec. 7.3). If solutions exist, they can all be obtained by
the Gauss elimination. (This method will automatically reveal whether or not
solutions exist; see Sec. 7.3.)
PROOF
(a) We can write the system (I) in vector form Ax = b or in terms of column vectors
c(l), • • • , c(n)
of A:
(2)
A is obtained by augmenting A by a single column b. Hence, by Theorem 3 in Sec. 7.4,
rank A equals rank A or rank A + 1. Now if (1) has a solution x, then (2) shows that b
must be a linear combination of those column vectors, so that A and A have the same
maximum number of linearly independent column vectors and thus the same rank.
Conversely, if rank A = rank A, then b must be a linear combination of the column
vectors of A, say,
(2*)
since otherwise rank
Xl
=
0'1' . . . • Xn
=
A=
an,
rank A + 1. But (2*) means that (1) ha<; a solution. namely,
as can be seen by comparing (2*) and (2).
(b) If rank A = n. the n column vectors in (2) are linearly independent by Theorem 3
in Sec. 7.4. We claim that then the representation (2) of b is unique because otherwise
This would imply (take all terms to the left, with a minus sign)
and Xl
scalars
-
Xl
0, ... , Xn - xn = 0 by linear independence. But this means that the
in (2) are uniquely determined, that is, the solution of (l) is unique.
Xl, . . . , Xn
304
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
(c) If rank A = rank A = I' < Il, then by Theorem 3 in Sec. 7.4 there is a linearly
independent set K of I' column vectors of A such that the other n - I' column vectors of
A are linear combinations of those vectors. We renumber the columns and unknowns,
denoting the renumbered quantities by A, so that (C(1), ... , c(r)} is that linearly independent
set K. Then (2) becomes
are linear combinations of the vectors of K, and so are the vectors
Expressing these vectors in terms of the vectors of K and collecting
terms, we can thus write the system in the form
CCr+l)' . . . , c(n)
Xr+IC(r+U' • . . • xnc(n)'
(3)
with Xi = Xj + {3j, where {3j resulls from the 11 - I' terms c(r+UXr+b ••. , c(n)xn ; here,
= 1, ... , r. Since the system has a solution, there are Yt> ... , Yr satisfying (3). These
scalars are unique since K is linearly independent. Choosing xr+l> ... , xn fixes the {3j
and corresponding Xj = )J - {3j, where j = I,' .. , r.
j
(d) This was discussed in Sec. 7.3 and is restated here as a reminder.
•
The theorem is illustrated in Sec. 7.3. In Example 2 there is a unique solution since
rank A = rank A = n = 3 (as can be seen from the last matrix in the example). In Example
3 we have rank A = rank A = 2 < n = 4 and can choose X3 and X4 arbitrarily. In Example
4 there is no solution because rank A = 2 < rank A = 3.
Homogeneous Linear System
Recall from Sec. 7.3 that a linear system (I) is called homogeneous if all the b/ s are
zero, and nonhomogeneous if one or several b/ s are not zero. For the homogeneous
system we obtain from the Fundamental Theorem the following results.
THEOREM 2
Homogeneous Linear System
A homogeneolls linear system
(4)
always hm the trivial solution Xl = 0, ... , Xn = O. Nontrivial solutions exist ~f and
ollly if rallk A < 11. ff rank A = I' < n, these solutions. together with x = 0, form a
vector :;pace (5ee Sec. 7.4) of dimension n - 1', called the solution space of (4).
III particular, !fXcl) and x(2) are solution vectors qf(4), then x = c l x(1) + C2Xc2)
with any sC(llars CI and C2 is a solution vector qf (4). (This does not hold for
nonhomogeneous systems. Also, the term solution space is used for homogeneous
systems only.)
SEC. 7.5
305
Solutions of Linear Systems: Existence, Uniqueness
PROOF
The first proposition can be seen directly from the system. It agrees with the fact that
b = 0 implies that rank A = rank A, so that a homogeneous system is always consistent.
If rank A = n, the trivial solution is the unique solution according to (b) in Theorem l.
If rank A < n, there are nontrivial solutions according to (c) in Theorem 1. The solutions
form a vector space because if x(l) and Xt.2) are any of them, then AX(1) = 0, AXt.2) = 0,
and this implies A(x(1) + X(2) = AX(l) + AX(2) = 0 as well as A(cx(1) = cAx(1) = 0,
where c is arbitrary. If rank A = r < n, Theorem I (c) implies that we can choose
n - r suitable unknowns. call them Xr+ 10 ••• , X n , in an arbitrary fashion, and every
solution is obtained in this way. Hence a basis for the solution space, briefly called a basis
of solutions of (4), is Y(1), •.• , Y(n-r), where the basis vector Y(j) is obtained by choosing
xr+j = 1 and the other xr+ 1, . . . , xn zero; the corresponding first I' components of this
solution vector are then determined. Thus the solution space of (4) has dimension n - r.
This proves Theorem 2.
•
°
The solution space of (4) is also called the null space of A because Ax = for every x
in the solution space of (4). Its dimension is called the nullity of A. Hence Theorem 2
states that
rank A + nullity A
(5)
=n
where n is the number of unknowns (number of columns of A).
Furthermore, by the definition of rank we have rank A ~ min (4). Hence if m < n,
then rank A < 11. By Theorem 2 this gives the practically important
THEOREM 3
Homogeneous Linear System with Fewer Equations Than Unknowns
A homogeneous linear system with fewer equations than unknowns has always
nontrivial solutions.
Nonhomogeneous Linear Systems
The characterization of all solutions of the linear system (I) is now quite simple. as follows.
THEOREM 4
Nonhomogeneous Linear System
!f a
nonhomogeneous linear system (l) is consistent. then all of its solutions are
obtained as
(6)
where Xo is any (fixed) solution Qf (l) and Xh runs through all the solutions Qf the
corresponding homogeneous system (4).
PROOF
The difference Xh = x - Xo of any two solutions of (1) is a solution of (4) because
xo) = Ax - Axo = b - b = 0. Since x is any solution of (1), we get all
the solutions of (l) if in (6) we take any solution Xo of (l) and let Xh vary throughout the
solution space of (4).
•
AXh = A(x -
306
7.6
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
For Reference:
Second- and Third-Order Determinants
We explain these determinants separately from the general theory in Sec. 7.7 because they
will be sufficient for many of our examples and problems. Since this section is for
reference, go on to the Ilext sectioll, consulting this material ollly when needed.
A determinant of second order is denoted and defined by
{/ll
(1)
D
= det A =
I
a21
So here we have bars (whereas a matrix has brackets).
Cramer's rule for solving linear systems of two equations in two unknowns
(2)
is
Xl =
Ihl
{/121
b2
{/22
b 1{/22
-
(/12 b2
D
D
(3)
la
X2
=
ll
bll
b2
a21
l/llb2
D
-
b 1{/21
D
with D as in (l), provided
D"*O.
°
The value D = appears for inconsistent nonhomogeneous systems and for homogeneous
systems with nontrivial solutions.
PRO 0 F
We prove (3). To eliminate X2' multiply (2a) by
Similarly, to eliminate
Xl' multiply (2a) by
"*
{/22
-a21
and (2b) by
and (2b) by
all
-a12
and add,
and add.
Assuming that D = all{/22 - {/12{/21
0, dividing, and writing the right sides of these
two equations as detelminants, we obtain (3).
•
SEC. 7.6
307
For Reference: Second- and Third-Order Determinants
E X AMP L E 1
Cramer's Rule for Two Equations
4XI + 3.\"2 = 12
2~1 +
= -8
If
then
xl =
5x2
112
-8 :1
4
1 2 :1
=
84
14
=6
'
X2 =
I:
I:
121
8
---
-56
14
-4.
•
:1
Third-Order Determinants
A determinant of third order can be defined by
12
a231_ a21 la
a33
a32
(4)
a131
a33
+ a31
12
la
a22
Note the following. The signs on the right are + - +. Each of the three terms on the
right is an entry in the first column of D times its minor, that is, the second-order
determinant obtained from D by deleting the row and column of that entry; thus. for all
delete the first row and first column, and so on.
If we write out the minors in (4), we obtain
Cramer's Rule for Linear Systems of Three Equations
(5)
IS
(6)
(D
*-
0)
with the determinant D afthe system given by (4) and
DI
hi
a12
al3
= h2
a22
a23 ,
h3
a32
a33
D2
all
hI
al3
= a2i
h2
a23 ,
a 31
h3
a33
D3
all
al2
hI
= a21
a22
h2
a3i
a32
h3
Note that D], D 2 , D3 are obtained by replacing Columns 1, 2, 3, respectively, by the
column of the right sides of (5).
Cramer's rule (6) can be derived by eliminations similar to those for (3), but it also
follows from the general case (Theorem 4) in the next section.
308
7.7
CHAP. 7
Linear Algebra: Matrices, Vectors. Determinants. Linear Systems
Determinants. Cramer's Rule
Determinants were originally introduced for solving linear systems. Although impractical
in computations, they have important engineering applications in eigenvalue problems
(Sec. 8.1), differential equations, vector algebra (Sec. 9.3), and so on. They can be
introduced in several equivalent ways. Our definition is particularly practical in connection
with linear systems.
A determinant of order n is a scalar associated with an 11 X 11 (hence square!) matrix
A = [ajk]' which is written
(1)
D=detA=
and is defined for n
=
(2)
and for n
(3a)
I by
D
~
= au
2 by
(j = 1. 2..... or n)
or
(3b)
Here,
and Mjk is a determinant of order n - I. namely, the determinant of the submatrix of A
obtained from A by omitting the row and column of the entry ajb that is, the jth row and
the kth column.
In this way, D is defined in terms of n determinants of order n - 1, each of which is,
in turn, defined in terms of n - I determinants of order n - 2, and so on; we finally
arrive at second-order determinants, in which those submatrices consist of single entries
whose determinant is defined to be the entry itself.
From the definition it follows that we may expand D by any row or column, that is,
choose in (3) the entries in any row or column, similarly when expanding the Cjk's in (3),
and so on.
This definition is unambiguous, that is, yields the same value for D no matter which
columns or rows we choose in expanding. A proof is given in App. 4.
SEC. 7.7
309
Determinants. Cramer's Rule
Terms used in connection with determinants are taken from matrices. In D we have n 2
entries ajk, also n rows and n columns, and a main diagonal on which all, a22, . . . , ann
stand. Two terms are new:
is called the minor of ajk in D, and Cjk the cofactor of ajk in D.
For later use we note that (3) may also be written in terms of minors
Mjk
n
D =
(4a)
2: (- L)j+kajkMjk
(j = 1, 2, ... , or n)
k~l
n
D =
(4b)
2: (-l)j+kajkMjk
(k
= 1, 2, ... , or n).
j~l
E X AMP L E 1
Minors and Cofactors of a Third-Order Determinant
In (4) of the previous section the minors and cofactors of the entries in the first column can be seen directly.
For the entries in the second row the minors are
and the cofactors are C21 = -M2b C22 = +M22 , and C23 = -M23. Similarly for the third row-write these
down yourself. And verify that the signs in Cjk fonn a checkerboard pattern
+
+
+
+
E X AMP L E 2
•
+
Expansions of a Third-Order Determinant
3
D=
2
6
-1
o
0
=
+ 4) + 0(0 + 6)
1(12 - 0) - 3(4
=
-12.
This is the expansion by the first row. The expansion by the third colmlll is
D = 0
I
:I = 0 -
2
-I
12 + 0 = -12,
•
Verify that the other four expansions also give the value -12.
E X AMP L E 3
Determinant of a Triangular Matrix
-3
o
6
4
-1
2
:I
= - 3· 4 . 5 = -60.
Inspired by this, can you formulate a little theorem on determinants of triangular matrices? Of diagonal
matrices?
•
CHAP. 7
310
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
General Properties of Determinants
To obtain the value of a determinant (1), we can first simplify it systematically by
elementary row operations. similar to those for matrices in Sec. 7.3. as follows.
THEOREM 1
Behavior of an nth-Order Determinant under Elementary Row Operations
(a) Interchange of two rows multiplies the value (~f the determinant by -1.
(b) Addition of a multiple of a row to another roH' does not alter the value of the
determinant.
(e) Multiplication of a row by a IlOn;:.ero constant c multiplies the I'aille of the
detenninant by c. (This holds also when c = 0, but gives no longer an elementary
row operation.)
PROOF
(a) By induction. The statement holds for n = 2 because
I:
bld = ad -
dlb = bc _ ad.
but
bc,
We now make the induction hypothesis that (a) holds for detenninants of order n - I ~ 2
and show that it then holds for determinants of order 11. Let D be of order n. Let E be
obtained from D by the interchange of two rows. Expand D and E by a row that is not
one of those interchanged. call it the jth row. Then by (4a).
n
(5)
D
=
L
E =
(-I)j+kajkMjk'
L
(-l)j+kajkNjk
k=l
k=l
where Njk is obtained ti'om the minor Mjk of ajk in D by the interchange of those two
rows which have been interchanged in D (and which Njk must both contain because we
expand by another row!). Now these minors are of order 11 - I. Hence the induction
hypothesis applies and gives Njk = -Mjk . Thus E = -D by (5).
(b) Add c times Row i to Row j. Let i5 be the new determinant. Its entries in Row j are
+ CGik- If we expand i5 by this Row j, we see that we can write it as i5 = DI + cD2 ,
where DI = D has in Row j the ajk, whereas D2 has in that Row j the (/ik from the addition.
Hence D2 has aik in both Row i and Row j. Interchanging these two rows gives D2 back.
but on the other hand it gives -D2 by (a). Together D2 = -D2 = O. so that i5 = DI = D.
ajk
(e) Expand the determinant by the row that has been multiplied.
CAUTION!
E X AMP L E 4
det (cA)
= c n det A (not c det A). Explain why.
•
Evaluation of Determinants by Reduction to Triangular Form
Because of Theorem 1 we may evaluate determinants by reduction to triangular form. as in the Gauss elimination
for a matrix. For instance (with the blue explanations always referring to the precedillg determinallt)
D=
2
o
6
4
5
o
o
2
6
8
9
-3
-I
SEC 7.7
311
Determinants. Cramer's Rule
2
0
-4
6
0
5
9
-12
0
2
6
-1
0
8
3
10
2
0
-4
6
0
5
9
-12
0
0
2.4
3.8
R"v. 3 - 004 Row 2
0
0
-11.4
29.2
R"v. 4 - 1.6 Rov. 2
2
0
-4
6
0
5
9
12
0
0
2.4
0
0
-0
Row 2
2 Row I
Rov. -l
1.5 Rov. I
3.8
47.25
Row 4 + 4.75 Row 3
•
= 2·5·2.4· 47.25 = 1134.
THEOREM 2
Further Properties of nth-Order Determinants
(a)-(c) ill Theorem I hoLd also for coLumlls.
(d) Trallsposition leaves the value of a detenninanl unaLtered.
(e) A zero row or columll renders the value of a detennillant ~ero.
(f) Proportional rows or columlls render the value of a determinant ::.ero. In
particular, a detemlil1ant with two identical rows or columlls has the I'aille ~ero.
PROOF
(a)-(e) follow directly from the fact that a determinant can be expanded by any row
column. In (d), transposition is defined as for matrices, that is, the jth row becomes the
jth column of the transpose.
(f) If Row j = c times Row i, then D = CDb where Dl has Row j = Row i. Hence an
interchange of these rows reproduces Db but it also gives -D 1 by Theorem l(a). Hence
Dl = 0 and D = cD l = O. Similarly for columns.
•
It is quite remarkable that the important L:oncept of the rank of a matrix A, which is the
maximum number of linearly independent row or column vectors of A (see Sec. 7.4), can
be related to determinants. Here we may assume that rank A > 0 because the only matrices
with rank 0 are the zero matrices (see Sec. 7.4).
THEOREM 3
Rank in Terms of Determinants
= [ajk] has rank I' ~ I ~f and only if A has all I' X rSlliJ111atrix
with non::.ero detel71zinant, ~l'hereas eve0' square suiJmatrix with more than I' rows
that A has (or does IlOt have!) has determinant equal to zero.
In particular, if A is square, n X n, it has rank 11 if and ol1ly if
All 111 X n matrix A
detA
"* O.
312
CHAP. 7
PROOF
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
The key idea is that elementary row operations (Sec. 7.3) alter neither rank (by Theorem
1 in Sec. 7.4) nor the property of a determinant being nonzero (by Theorem 1 in this
section). The echelon fonn A of A (see Sec. 7.3) has r nonzero row vectors (which are
the first r row vectors) if and only if rank A = r. Let R be the r X r submatrix in the left
upper corner of A (so that the entries of R are in both the first r rows and r columns of A).
Now R is triangular, with all diagonal entries l'Jj nonzero. Thus, det R = r11 ... Ir,. =I=- O.
Also det R =I=- 0 for the corresponding r X r submatrix R of A because R results from R
by elementary row operations. Similarly, det S = 0 for any square submatrix S of r + I
or more rows perhaps contained in A because the corresponding submatrix S of A must
contain a row of zeros (otherwise we would have rank A ~ r + I), so that det S = 0 by
Theorem 2. This proves the theorem for an m X n matrix.
In particular. if A is square. n X n. then rank A = n if and only if A contains an 11 X n
submatrix with nonzero determinant. But the only such submatrix can be A itself. hence
detA =I=- O.
•
Cramer's Rule
Theorem 3 opens the way to the classical solution formula for linear systems known as
Cramer's rule 2 , which gives solutions as quotients of determinants. Cramer's rule is not
practical ill computations (for which the methods in Secs. 7.3 and 20.1-20.3 are suitable),
but is of theoretical interest in differential equations (Secs. 2.10, 3.3) and other theories
that have engineering applications.
THEOREM 4
Cramer's Theorem (Solution of Linear Systems by Determinants)
(a)
If a linear system qf n equatio/lS in the same /lumber of unknow/lS x I, • . . , Xn
(6)
has a nonzero coefficient determinant D = det A, the system has precisely one
solution. This solution is given by the f017nulas
(7)
xn =
(Cramer's rule)
where Dk is the determinant obtained from D by replacing in D the kth columll by
the column with the entries b I , . . . ,bn(b) Hence if the SYSTem (6) is homogeneous and D =I=- 0, it has only The Trivial
soluTion Xl = 0, X2 = 0, ... , Xn = O. If D = 0, the homogeneous system also has
nontrivial solutions.
20ABRIEL CRAMER (1704--1752), Swiss mathematician.
SEC. 7.7
Determinants. Cramer's Rule
PROOF
The augmented matrix
at most n. Now if
(8)
A of the system (6) is of size n
X (n
+
1). Hence its rank can be
D=detA=
then rank A = n by Theorem 3. Thus rank A = rank A. Hence. by the Fundamental
Theorem in Sec. 7.5, the system (6) has a unique solution.
Let us now prove (7). Expanding D by its kth column, we obtain
(9)
where Cik is the cofactor of entry (lik in D. If we replace the entries in the kth column of
D by any other numbers, we obtain a new determinant, say, D. Clearly, its expansion by
the kth column will be of the form (9), with alk, . . . , (Ink replaced by those new numbers
and the cofactors Cik as before. In particular, if we choose as new numbers the entries
(Ill, • . • , (lnl of the lth column of D (where I
k), we have a new determinant D which
T
has twice the column [all
(lnzl • once as its lth column. and once as its kth
because of the replacement. Hence D = 0 by Theorem 2(f). [f we now expand b by the
column that has been replaced (the kth column). we thus obtain
*'
(10)
(l
We now multiply the first equation in (6) by Clk on both sides. the second by
the last by Cnk , and add the resulting equations. This gives
*' k).
C 2k , • • . •
(11)
Collecting terms with the same
Xj'
we can write the left side as
From this we see that Xk is multiplied by
Equation (9) shows that this equals D. Similarly,
Equation (10) shows that this is zero when I
simply xkD, so that (11) becomes
Xl
is multiplied by
*' k. Accordingly, the left side of (11) equals
CHAP. 7
314
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Now the right side of this is Dk as defined in the theorem. expanded by its kth column.
so that division by D gives 0). This proves Cramer's rule.
If (6) is homogeneous and D
0, then each Dk has a column of zeros, so that Dk = 0
by Theorem 2(e). and (7) gives the trivial solution.
Finally, if (6) is homogeneous and D = 0, then rank A < 11 by Theorem 3, so that
•
nontrivial solutions exist by Theorem 2 in Sec. 7.5.
"*
Illustrations of Theorem ..J. for 11 = 2 and 3 are given in Sec. 7.6. and an important
application of the present formulas will follow in the next section.
--PJlOBLEM3£T~
1. (Second-order detenninant) Expand a general secondorder determinant in four possible ways and show that
the results agree.
2. (Minors, cofactors) Complete the list of minors and
cofactors in Example 1.
3. (Third-order detenninant) Do the task indicated in
Example 2. Also evaluate D by reduction to triangular
form.
4. (Scalar multiplication) Show that det (kA) = k n det A
(not k det A), where A is any 11 X 11 matrix. Give an
example.
3
15.
81
-2
7.
6.
7
Icos a
sin
al
sin f3
cos
f3
0.3
0.8
0
0.5
2.6
0
0
-1.9
118
2
5
2
o
8
5
8-2
2
10.
o
o
3
-1
11. -3
0
-4
4
0
l/
13.
U
12.
l/
U
U
W
l/
6
7
8
-I
o
2
0
-4
-1
o
10
15
20
25
Time
0.004
sec
22
min
77
years
0.5' 109
years
18. 2x - 5y
=
23
4x + 6y
=
-2
2
-2
19.
2
2
-2
0
a
b
-a
0
c
-b
-c
0
3y
4x
20.
+
0
0
4
3
5
0
0
2
7
5
3w
14.
0
2
4
+ 4:::
2y -
y
+
=
14.8
=
-6.3
5z =
13.5
7
w +2x
2w
-2
0
5
118-201 CRAMER'S RULE
Solve by Cramer's rule and check by Gauss elimination and
back substitution. (Show details.)
w
w
4
11
x -
0
0-2
16.
cos 118
14
8.
70.4
9.
I -sin
o
o
2
o
-2
17. (Expansion numericallJ impractical) Show that the
computation of an nth-order determinant by expansion
involves n! multiplications, which if a multiplication
takes 10- 9 sec would take these times:
15-161 EVALUATION OF DETERMINANTS
Evaluate, showing the details of your work.
cos 118 sin 11 81
5.113
4
o
o
o
o
o
2
121-231
+
- 3::: = 30
4x - 5)"
+
8x - 4y
+
2:::
=
13
z = 42
+ y - 5;:
=
35
RANK BY DETERMINANTS
Find the rank by Theorem 3 (which is not a very practical
way) and check by row reduction. (Show details.)
SEC 7.8
21. [-:
(a) Line through two points. Derive from D = 0 in
(12) the familiar fonnula
-:J
y - Yl
22.ll~ -13 1:]
l-3
23.
[
5
-4
0.4
o
-2.4
1.2
0.6
o
3.0]
0.3
o
1.2
1.2
o
24. TEAM PROJECT. Geometrical Applications:
Curves and Surfaces Through Given Points. The
idea is to get an equation from the vanishing of
the detenninant of a homogeneous linear system as the
condition for a nontrivial solution in Cramer's theorem.
We explain the trick for obtaining such a system for
the case of a line L through two given points PI: (x 1> Y 1)
and P 2 : (X2, )'2)· The unknown line is ax + by = -c,
say. We write it as ax + by + c· I = O. To get a
nontrivial solution a, b, c, the determinant of the
"coefficients" x, y, I must be zero. The system is
(2)
7.8
315
Inverse of a Matrix. Gauss-Jordan Elimination
ax
+ by + c·
aXI
+ bYI + c·
I
o
o
o
(Line L)
(PIon L)
(P 2 on L).
(b) Plane. Find the analog of (12) for a plane through
three given points. Apply it when the points are (I, I, I),
(3, 2, 6), (5, 0, 5).
(c) Circle. Find a similar formula for a circle in the
plane through three given points. Find and sketch the
circle through (2. 6). (6. 4). (7. I).
(d) Sphere. Find the analog of the formula in (c) for
a sphere through four given points. Find the sphere
through (0, 0, 5), (4, 0, I), (0,4, I), (0, 0, 3) by this
formula or by inspection.
(e) General conic section. Find a fonnula for a
general conic section (the vanishing of a detenninant
of 6th order). Try it out for a quadratic parabola and
for a more general conic section of your own choice.
25. WRITING PROJECT. General Properties of
Determinants. Illustrate each statement in Theorems
I and :2 with an example of your choice.
26. CAS EXPERIMENT. Determinant of Zeros and
Ones. Find the value of the determinant of the n X n
matrix An with main diagonal entries all 0 and all others
I. Try to find a formula for this. Try to prove it by
induction. Interpret A3 and ~ as "incidence lI1i1frices"
(as in Problem Set 7.1 but without the minuses) of a
triangle and a tetrahedron, respectively; similarly for
an un-simplex". havingn vertices andn(n- l)/2edges
(and spanning R"-I, 11 = 5,6, ... ).
Inverse of a Matrix.
Gauss-Jordan Elimination
In this section we consider square matrices exclusively.
The inverse of an n X n matrix A = [ajk] is denoted by A -1 and is an 11
such that
X
n matrix
(1)
where I is the n X 11 unit matrix (see Sec. 7.2).
If A has an inverse, then A is called a nonsingular matrix. If A has no inverse, then
A is called a singular matrix.
If A has an inverse, the inverse is unique.
Indeed, if both Band C are inverses of A, then AB = I and CA = I, so that we obtain
the uniqueness from
B = IE
= (CA)B
=
CCAB) = CI = C.
CHAP. 7
316
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
We prove next that A has an inverse (is nonsingular) if and only if it has maximum
possible rank n. The proof will also show that Ax = b implies x = A -lb provided A-]
exists. and will thus give a motivation for the inverse as well as a relation to linear systems
(But this willilot give a good method of solving Ax = b Illlmerically because the Gauss
elimination in Sec. 7.3 requires fewer computations.)
THEOREM 1
Existence of the Inverse
The inverse A-I of an n X n matrix A exists if and only if rank A = n, thus (by
Theorem 3, Sec. 7.7) if and onZy if det A O. Hence A is nonsingular if rank A = n,
and is singular if rank A < n.
"*
PROOF
Let A be a given
/1
X n matrix and consider the linear system
(2)
Ax
= h.
If the inverse A-I exists, then multiplication from the left on both sides and use of (1)
gives
A-lAx
= x = A-lb.
This shows that (2) has a unique solution x. Hence A must have rank /1 by the Fundanlental
Theorem in Sec. 7.5.
Conversely, let rank A = n. Then by the same theorem, the system (2) has a unique
solution x for any b. Now the back substitution following the Gauss elimination (Sec. 7.3)
shows that the components Xj of x are linear combinations of those of b. Hence we can
write
x
(3)
= Bb
with B to be determined. Substitution into (2) gives
Ax
for any b. Hence C
get
= A(Bb) = (AB)b = Cb = b
AB)
= AB = I, the unit matrix. Similarly, if we substitute (2) into (3) we
x
for any x (and b
(C =
= Bb = B(Ax) = (BA)x
= Ax). Hence BA = I. Together, B = A-I exists.
•
3 WILHELM JORDAN (IR42-1899), German mathematician and geodesist. [See American Mathematical
Monthly 94 (1987). 130-142.]
We do not recommend it as a method for solving sy~tems of linear equations, since the number of operations
in addition to those of the Gauss elimination is larger than that for back substitution, which the Gauss-Jordan
elimination aVOlds. See also Sec. 20.1.
SEC. 7.8
Inverse of a Matrix. Gauss-Jordan Elimination
317
Determination of the Inverse
by the Gauss-Jordan Method
For the practical determination of the inverse A-I of a nonsingular n X 11 matrix A we
can use the Gauss elimination (Sec. 7.3), actually a variant of it, called the Gauss-Jordan
elimination3 (footnote of p. 316). The idea of the method is as follows.
Using A, we form n linear systems
e(n) are the columns of the 11 X n unit matrix I; thus,
where eel),
em = [\ 0
O]T, e(2) = [0 I 0
OlT, etc. These are 11 vector equations
in the unknown vectors xm, ... , x(n)' We combine them into a single matrix equation
AX = I, with the unknown matrix X having the columns xm'···. x(n)'
Correspondingly, we combine the n augmented matrices [A em],"', [A e(n)] into
one n X 2n "augmented matrix" A = [A I]. Now multiplication of AX = I by A- 1
from the left gives X = A -II = A -1. Hence, to solve AX = I for X, we can apply the
Gauss elimination to A = [A I]. This gives a matrix of the form [U H] with upper
triangular U because the Gauss elimination triangularizes systems. The Gauss-Jordan
method reduces U by further elementary row operations to diagonal form, in fact to the
unit matrix I. This is done by eliminating the ennies of U above the main diagonal and
making the diagonal entries all 1 by multiplication (see the example below). Of course,
the method operates on the entire matrix rU Hl, transforming H into some matrix K,
hence the entire [U H] to [I K]. This is the "augmented matrix" of IX = K. Now
IX = X = A -t, as shown before. By comparison. K = A -t, so that we can read A- J
directly from [I K].
The following example illustrates the practical details of the method.
E X AMP L E 1
Inverse of a Matrix. Gauss-Jordan Elimination
Determine the inverse A-I of
A
=
[-~
~l
-1
-1
3
4
Solution.
We apply the Gauss elimination (Sec. 7.3) to the following n X 2n
always refers to the previous matrix.
3 X
n matrix, where BLUE
o
2
4
=
o
o
0
o
2
7
3
2
-]
Row 2 + 3 Row]
0
Row 3 - Row]
o
2
7
3
-5
-4
-I
Row 3 - Row 2
318
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
This is IV H] as produced by the Gauss elimination. Now follow the additional Gauss-Jordan steps. reducing
U to I, that is. to diagonal form with entries I on the main diagonal.
[~
-I
-2
-I
1.5
0.5
0.8
O.:!
0
0.6
0.4
0
1.3
-0.2
0.8
0.2
0
-0.7
0.2
0
-1.3
-0.2
0.8
0.2
3.5
0
-I
[:
0
0
[:
Row I
0
0
-0:]
-0']
0.7
0.5 Row 2
-0.2 Row 3
Rov, I
2 Rov, 3
Rov, 2 - 3.5 Row .3
-0.2
03]
Rov, I
+ Row 2
0.7
-0.2
The last three columns constitute A -1. Check:
[-;
-I
-)
.3
2] [-07
0.2
L
-1.3
-0.2
4
0.8
0.2
03] ['
0.7
=
-0.2
0
0
0
0
:l
•
Hence AA- 1 = I. Similarly, A- 1 A = I.
Useful Formulas for Inverses
The explicit formula (4) in the following theorem is often useful in theoretical studies (as
opposed to computing inverses). In fact, the special case 11 = 2 occurs quite frequently in
geometrical and other applications.
THEOREM 2
Inverse of a Matrix
The inverse of a 110nsi11gular n X
(4)
A-I
11
matrix A = [ajk] is given by
_
I
= -I- [Cd T --
det A
J
Cl l
C21
Cnl
Cl2
C22
Cn2
C1n
C2n
Cnn
detA
where Cjk is the cofactor of ajk in det A (see Sec. 7.7). (CAUTION! Note well that
in A -\ the cofactor Cjk occupies the same place as alrj (not ajk) does in A.)
III particular. the inverse of
(4*)
is
A-I =
detA
SEC. 7.8
Inverse of a Matrix. Gauss-Jordan Elimination
PROOF
319
We denote the right side of (4) by B and show that BA = I. We first write
(5)
and then show that G = I. Now by the definition of matrix multiplication and because of
the form of B in (4), we obtain (r AUTION! Csb not C ks )
(6)
Now (9) and (l0) in Sec. 7.7 show that the sum ( ... ) on the right is D = det A when
I = k, and is zero when I =1= k. Hence
1
detA
= - - detA = 1,
gkk
= 0
gkZ
(I
k),
=I=-
In particular, for n = 2 we have in (4) in the first row C ll =
the second row C 12 = -a2b C 22 = all' This gives (4*).
E X AMP L E 2
Inverse of a 2
x
=
-a 12
and in
•
2 Matrix
A-I _
~
[
10
E X AMP L E 3
C 21
a22,
-IJ
4
-2
=
[
3
0.4
-0.2
-O.IJ
•
0.3
Further Illustration of Theorem 2
Using (4), find the inverse of
Solution.
Cl l =
-1
-1
3
:]
4
We obtain detA = -1(-7) - 1'13 + 2·8 = 10, and in (4),
I-I
:1
3
CI2 =
_I -1
CI3
I-1 -'I
=
A = [-:
3
3
3
=
-7,
:1
=
=
8,
-13,
C2I =
-I~ :1
C22 =
=
1-1 :1
-1
1-1 ~I
-1
=
C23 = -
I-11
2,
C31 =
-2,
C32 = -
=
2,
C33 =
:1
=
1-1 :1
3,
3
=
1-1 11
-1
= -2,
3
7,
so that by (4), in agreement with Example 1,
-0.7
A-I
=
-1.3
0.2
-0.2
0.8
0.2
0.3]
0.7.
[
-0.2
•
CHAP. 7
320
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Diagonal matrices A = [ajk]' (/jk = 0 when j
Ojj =I=- O. Then A-I is diagonal, too, with entries
PROOF
=I=-
k. have an inverse if and only if all
l/onn.
1/(/11' • . . ,
For a diagonal matrix we have in (4)
ell
etc.
D
E X AMP L E 4
•
Inverse of a Diagonal Matrix
Let
A=
[
-0.5
0
OJ
0
4
O.
°
0
I
Then the inverse is
o
0.25
o
•
Products can be inverted by taking the inverse of each factor and mUltiplying these
inverses in reverse order,
(7)
Hence for more than two factors,
(8)
PROOF
The idea is to start from (I) for AC instead of A, that is, AC(Aq-1 = I, and mUltiply
it on both sides from the left, first by A -t, which because of A -IA = I gives
A-1AC(Aq-1
= C(Aq-1
= A-II = A-I,
and then multiplying this on both sides from the left, this time by C- l and by using
C-1C = I,
This proves (7). and from it. (8) follows by induction.
We also note that the inverse of the inverse is the given matrix, as you may prove,
(9)
•
SEC. 7.8
:m
Inverse of a Matrix. Gauss-Jordan Elimination
Unusual Properties of Matrix Multiplication.
Cancellation Laws
Section 7.2 contains warnings that some properties of matrix multiplication deviate from
those for numbers, and we are now able to explain the restricted validity of the so-called
cancellation laws [2.] and [3.] below, using rank and inverse, concepts that were not yet
available in Sec. 7.2. The deviations from the usual are of great practical importance and
must be carefully observed. They are as follows.
[1.] Matrix multiplication is not commutative, that is, in general we have
AB
=1=
BA.
[2.] AB = 0 does not generally imply A = 0 or B = 0 (or BA = 0); for example,
[~
[3.] AC = AD does not generally imply C = D (even when A
=1=
0).
Complete answers to [2.] and [3.] are contained in the following theorem.
THEOREM 3
Cancellation Laws
Let A, B, C be n X n matrices. Then:
(a) If rank A
= nand AB = AC, then B =
C.
(b) lfrank A = n, then AB = 0 implies B = O. Hence if AB = 0, but A
as well as B =1= 0, then rank A < n and rank B < n.
=1=
0
(c) If A is singular, so are BA and AB.
PROOF
(a) The inverse of A exists by Theorem 1. Multiplication by A-I from the left gives
A -lAB = A -lAC, hence B = C.
(b) Let rank A = n. Then A -1 exists, and AB = 0 implies A -lAB = B = O. Similarly
when rank B = n. This implies the second statement in (b).
(c l ) Rank A < n by Theorem 1. Hence Ax = 0 has nontrivial solutions by Theorem 2
in Sec. 7.5. Multiplication by B shows that these solutions are also solutions of BAx = 0,
so thaI rank (BA) < n by Theorem 2 in Sec. 7.5 and BA is singular by Theorem 1.
(c 2 ) AT is singular by Theorem 2(d) in Sec. 7.7. Hence B TAT is singular by part (c 1 ),
and is equal to (AB)T by (lOd) in Sec. 7.2. Hence AB is singular by Theorem 2(d) in
Sec. 7.7.
•
Determinants of Matrix Products
The detelminant of a matrix product AB or BA can be written as the product of the
determinants of the factors, and it is interesting that det AB = det BA, although AB =1= BA
in general. The corresponding formula (10) is needed occasionally and can be obtained
by Gauss-Jordan elimination (see Example 1) and from the theorem just proved.
311
CHAP. 7
THE 0 REM 4
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Determinant of a Product of Matrices
For am: n
X
n matrices A and B,
(10)
det (AB)
= del (BA) = det A det B.
If A or B is singular. so are AB and BA by Theorem 3(c), and (10) reduces to 0 = 0 by
Theorem 3 in Sec. 7.7.
Now let A and B be nonsingular. Then we can reduce A to a diagonal matrix A = [ajk]
by Gauss-Jordan steps. Under these operations, det A retains its value, by Theorem I in
Sec. 7.7, (a) and (b) [not (c)] except perhaps for a sign reversal in row interchanging when
pivoting. But the same operations reduce AB to AB with the same effect on det (AB).
Hence it remains to prove (10) for AB; written out,
PROOF
all
0
0
b ll
b I2
bIn
0
a22
0
b 2I
b 22
b 2n
0
0
ann
b nl
b n2
b nn
AB=
al1 b l1
al1 b I2
allb in
a22 b 2I
a22 b 22
a22 b 2n
annbnl
a nn b n2
annbnn
all
...
We now take the determinant det (AB). On the right we can take out a factor
from
the first row, a22 from the second, ... , ann from the nth. But this product
~2
ann
equals det A because A is diagonal. The remaining determinant is det B. This proves (10)
for det (AB), and the proof for det (BA) follows by the same idea.
•
all
This completes our discussion of linear systems (Secs. 7.3-7.8). Section 7.9 on vector
spaces and linear transformations is optional. Numeric methods are discussed in Secs.
20.1-20.4, which are independent of other sections on numerics .
.•
~
~-=-12!
-.-..
.......
... ..-._..
INVERSE
Fmd the inverse by Gauss-Jordan [or by (4*) if 11 = 2] or
state that it does not exist. Check by using (1).
1. [
1.20
4.64J
0.50
3.60
0.6
2. [
0.8
3. [ cos 28
0.8J
-0.6
sin 28]
-sin 28 cos 28
SEC. 7.9
Vector Spaces. Inner Product Spaces. Linear Transformations
-11
-I
S.
[-I; -:]
[: ]
29
6. [ -160
61
-55
-2
55
-21
19
0
7.
[:
[ J [
8.
4
9.
10.
0
0
2
11. [:
-1
4
7.9
10]
6
1:]
12.
[-;
13. (Triangular matrix) Is the inver~e of a triangular
matrix always triangular (as in Prob. 7)? Give reason.
14. (Rotation) Give an application of the matrix in Prob.
3 that makes the form of its inverse obvious.
15. (Inverse of the square) Verify (A2 r
A in Prob. 5.
2
1]
-I
4
:]
0
=
(A-If for
16. Prove the formula in Prob. 15.
17. (Inverse of the transpose) Verify (AT) -1
for A in Prob. 5.
[ 1-231
-4
-9]
-1
2
2
1
=
(A _1)T
18. Prove the formula in Prob. 17.
19. (Inverse of the inverse) Prove that (A -1)-1 = A.
20. (Row interchange) Same question as in Prob. 14 for
the matrix in Prob. 9.
8
0
323
Optional
19
EXPLICIT FORMULA (4) FOR THE
INVERSE
Formula (4) is generally not very practical. To understand
its use, apply it:
21. To Prob. 9.
22. To Prob. 4.
23. To Prob. 7.
Vector Spaces, Inner Product Spaces,
Linear Transformations Optional
In Sec. 7.4 we have Seen that special vector spaces arise quite naturally in connection
with matrices and linear systems, that their elements, called vectors, satisfy rules quite
similar to those for numbers [(3) and (4) in Sec. 7.1], and that they are often obtained as
spans (sets of linear combinations) of finitely many given vectors. Each such vector has
n real numbers as its compollents. Look this up before going on.
Now if we take all vectors with II real numbers as components ("real vectors"), we
obtain the very important realll-dimensional vector space Rn. This is a standard name
and notation. Thus, each vector in R n is an ordered n-tuple of real numbers.
Pat1icular cases are R2, the space of all ordered pairs (""vectors in the plane") and R 3,
the space of all ordered triples ("vectors in 3-space"). These vectors have wide applications
in mechanics, geometry, and calculus that are basic to the engineer and physicist.
Similarly, if we take all ordered n-tuples of complex numbers as vectors and complex
numbers as scalars, we obtain the compleJ!: vector space en, which we shall consider in
Sec. 8.5.
This is not alL There are other sets of practical interest (sets of matrices, functions,
transformations, etc.) for which addition and scalar multiplication can be defined in a
natural way so that they foml a "vector space". This suggests to create from the "COil crete
model" R n the "abstract cOllcept" of a "real vector space" V by taking the basic properties
(3) and (4) in Sec. 7.1 as axioms. These axioms guarantee that one obtains a useful and
applicable theory of those more general situations. Note that each axiom expresses a simple
property of R n or, as a matter of fact. of R3. Selecting good axioms needs experience and
is a process of trial and error that often extends over a long period of time.
324
DEFINITION
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Real Vector Space
A nonempty set V of elements a, b, ... is called a real vector space (or real linear
space), and these elements are called vectors (regardless of their nature, which will
come out from the context or will be left arbitrary) if in V there are defined two
algebraic operations (called vector addition and scalar multiplication) as follows.
I. Vector addition associates with every pair of vectors a and b of V a unique
vector of V, called the sum of a and b and denoted by a + b, such that the following
axioms are satisfied.
1.1 Commutativity. For any two vectors a and b of V,
a+b=b+a.
1.2 Associativity. For any three vectors u. v. w of V,
(u + v) + w = u + (v + w)
(written u + v + w).
1.3 There is a unique vector in V, called the zero vector and denoted by 0, such
that for every a in V,
a+O=a.
1.4 For every a in V there is a unique vector in V that is denoted by -a and is
such that
a
+
(-a)
= O.
II. Scalar multiplication. The real numbers are called scalars. Scalar
multiplication associates with every a in V and every scalar c a unique vector of V,
called the product of c and a and denoted by ca (or ac) such that the following
axioms are satisfied.
11.1 Distributivity. For every scalar c and vectors a and b in V,
c(a + b) = ca + cb.
11.2 Distributivity. For all scalars c and k and every a in V,
(c
+ k)a
= ca
+ ka.
11.3 Associativity. For all scalars c and k and every a in V,
c(ka)
= (ck)a
(written cka).
11.4 For every a in V,
la = a.
A complex vector space is obtained if, instead of real numbers, we take complex numbers
as scalars.
SEC. 7.9
325
Optional
Vector Spaces, Inner Product Spaces, Linear Transformations
Basic concepts related to the concept of a vector space are defined as in Sec. 7.4.
A linear combination of vectors a(l),"', a(m) III a vector space V is an
expression
(C1, .•. , C m any scalars).
These vectors form a linearly independent set (briefly, they are called linearly
independent) if
(1)
implies that C1 = 0, ... , Cm = O. Otherwise, if (1) also holds with scalars not all zero,
the vectors are called linearly dependent.
Note that (1) with 11l = I is ca = 0 and shows that a single vector a is linearly
independent if and only if a =F O.
V has dimension n, or is n-dimensional, if it contains a linearly independent set of n
vectors, whereas any set of more than n vectors in V is linearly dependent. That set of n
linearly independent vectors is called a basis for V. Then every vector in V can be written
as a linear combination of the basis vectors; for a given basis, this representation is unique
(see Prob. 14).
E X AMP L E 1
Vector Space of Matrices
The real 2 X 2 matrice, form a four-dimensional real vector space. A
ba~is
is
~J
~J
because any 2 X 2 matrix A = [ajkJ has a unique representation A = allB11 + 012B12 + 021B21 + 022B22'
Similarly. the real 111 X II matrices with fixed 111 and n form an mil-dimensional vector space. What is the
•
dimension of the vector space of all 3 X 3 skew-symmetric matrices'! Can you find a basis?
E X AMP L E 2
Vector Space of Polynomials
The set of all constant, linear, and quadratic polynomials in x together is a vector space of dimension 3 with
basis {I. x, .r 2 } under the usual addition and multiplication by real numbers because these two operations give
polynomials not exceeding degree 2. What is the dimension of the vector space of all polynomials of degree
not exceeding a given fixed n'! Can you find a basis?
•
If a vector space V contains a linearly independent set of 11 vectors for every n, no matter
how large, then V is called infinite dimensional, as opposed to a finite dimensional
(n-dimensional) vector space just defined. An example of an infinite dimensional vector
space is the space of all continuous functions on some interval [ll, b J of the x-axis, as we
mention without proof.
Inner Product Spaces
If a and b are vectors in Rn, regarded as column vectors, we can form the product a Tb.
This is a 1 X 1 matrix, which we can identify with its single entry, that is, with a number.
This product is called the inner product or dot product of a and b. Other notations for
it are (a, b) and a·b. Thus
aTb = (a, b) = a·b = [al' .. an]
[~1]
:
n
=
~ alb l = alb l
l=l
bn
+ ... +
anbn-
326
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
We now extend this concept to general real vector spaces by taking basic properties of
(a, b) as axioms for an "abstract inner product'" (a, b) as follows.
Real Inner Product Space
DEFINITION
A real vector space V is called a real inner product space (or real pre-Hilbert4
space) if it has the following property. With every pair of vectors a and b in V there
is associated a real number, which is denoted by (a, b) and is called the inner
product of a and b, such that the following axioms are satisfied.
I. For all scalars ql and q2 and all vectors a, b, c in V,
(Linearity).
II. For all vectors a and b in V.
(a, b)
= (b,
(Symmetry).
a)
III. For every a in V,
(a, a)
~
0,
(Positive-definiteness).
(a, a)
=0
if and only if
Vectors whose inner product is zero are called orthogonal.
The length or norm of a vector in V is defined by
lIall = Yea, a)
(2)
(~
0).
A vector of norm 1 is called a unit vector.
From these axioms and from (2) one can derive the basic inequality
(3)
(Callchy-Schwarz5 inequality).
From this follows
(Triangle inequality).
(4)
A simple direct calculation gives
(5)
lIa + bll
2
+ lIa - bll 2 =
2(
lIall
2
+ lib II 2)
(Parallelogram equality).
4DAVID HILBERT (1862-1943), great Gennan mathematician, taught at Konigsberg and Gottingen and was
the creator of the famous Gottingen mathematical schooL He is known for his basic work in algebra. the calculus
of variations. integral equations, functional analysis, and mathematical logic. His "Foundations of Geometry"
helped the axiomatic method to gain general recognition. His famous 23 problems (presented in 1900 at the
International Congress of Mathematicians in Paris) considerably influenced the development of modem
mathematics.
If V b finite dimensional. it is actually a so-called Hilbert :lpace; see Ref. [GR7], p. 73, listed in App. L
5 HERMANN AMANDUS SCHWARZ (1843-1921). Gennan mathematician, known by his work in complex
analysis (confonnal mapping) and differential geometry. For Cauchy see Sec. 2.5.
SEC. 7.9
Optional
Vector Spaces, Inner Product Spaces, Linear Transformations
E X AMP L E 3
327
n-Dimensional Euclidean Space
R n with the inner product
(6)
(where both a and b are column vectors) is called the n-dimensional Euclidean space and is denoted by En
or again simply by Rn. Axioms I-III hold, as direct calculation shows. Equation (2) gives the "Euclidean norm"
•
(7)
E X AMP L E 4
An Inner Product for Functions. Function Space
The set of all reaT-valued continuous functions I(x), g(x), ... on a given interval a ::'" x ::'" f3 is a real vector
space under the usual addition of functions and multiplication by scalars (real numbers). On this "function
space" we can define an inner product by the integral
{3
(f, g) =
(8)
{I(X) g(x) ,ll-.
Axioms I-ITT can be verified by direct calculation. Equation (2) gives the norm
(3
IIIII
(9)
Y(f, I)
=
=
•
{f"(X)2 d".
Our examples give a first impression of the great generality of the abstract concepts of
vector spaces and inner product spaces. Further details belong to more advanced courses
(on functional analysis. meaning abstract modern analysis; see Ref. [OR7] listed in App. 1)
and cannot be discussed here. Instead we now take up a related topic where matrices play
a central role.
Linear Transformations
Let X and Y be any vector spaces. To each vector x in X we assign a unique vector y in
y. Then we say that a mapping (or transformation or operator) of X into Y is given.
Such a mapping is denoted by a capital letter, say F. The vector y in Yassigned to a vector
x in X is called the image of x under F and is denoted by F(x) [or Fx, without parentheses J.
F is called a linear mapping or linear transformation if for all vectors v and x in X
and scalars c,
F(v
+
x) = F(v)
+ F(x)
(10)
F(cx) = cF(x).
Linear Transformation of Space R n into Space R m
From now on we let X = Rri and Y = RIn. Then any real m
X n
matrix A = [ajk] gives
a transformation of R n into Rnl,
y
(11)
Since A(u
+
x)
= Ax.
= Au + Ax and A(cx) = cAx, this transformation is linear.
We show that, conversely, every linear transformation F of R" into R'm can be given
in terms of an 111 X n matrix A, after a basis for R n and a basis for R m have been chosen.
This can be proved as follows.
328
CHAP. 7
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
Let em, . . . . e(n) be any basis for Rn. Then every x in R n has a unique representation
Since F is linear, this representation implies for the image F(x):
Hence F is uniquely determined by the images of the vectors of a basis for R". We now
choose for R n the "standard basis"
o
o
o
o
(12)
o
o
where e(j) has its jth component equal to 1 and all others O. We show that we can now
determine an I1l X n matrix A = [~jk] such that for every x in R n and image y = F(x) in R m ,
y = F(x) = Ax.
Indeed, from the image y(1)
=
F(e(1) of e(n we get the condition
o
o
\" 0)
.m
from which we can determine the first column of A, namely lIll = yill. (/21 = y~l), ... ,
= Y1~1)· Similarly, from the image of e(2) we get the second column of A, and so on.
This completes the proof.
•
amI
We say that A represents F, or is a representation of F, with respect to the bases for R'n
and Rm. Quite generally, the purpose of a "representation" is the replacement of one
object of study by another object whose properties are more readily apparent.
In three-dimensional Euclidean space £3 the standard basis is usually written eO) = i,
e(2) = j, e(3) = k. Thus.
(13)
j
SEC. 7.9
Optional
Vector Spaces, Inner Product Spaces, Linear Transformations
329
These are the three unit vectors in the positive directions of the axes of the Cartesian
coordinate system in space, that is, the usual coordinate system with the same scale of
measurement on the three mutually perpendicular coordinate axes.
E X AMP L E 5
Linear Transformations
Interpreted as transformations of Cartesian coordinates in the plane, the matrices
[1 OJ
1J
[0I 0
'
°
represent a reflection in the line x2 =
(when a> 1, or a contraction when
°<
E X AMP L E 6
[a OJ
[ . 1 OJ
°
-I'
°
-I'
I
a reflection in the xraxis, a reflection in the origin. and a stretch
a < 1) in the xrdirection, respectively.
•
Xl,
Linear Transformations
Our discussion preceding Example 5 is simpler than it may look at first sight. To see this, find A representing
the linear transfonnation that maps (Xl, X2) onto (~XI - 5X2' 3XI + 4X2)'
Solution.
Obviously, the transfonnation is
From this we can directly see that the matrix is
[ YIJ [23 -5J4 [XIJ
Check:
•
=
Y2
X2
If A in (11) is square, 11 X 11, then (11) maps R'n into Rn. If this A is nonsingular, so that
A -1 exists (see Sec. 7.8), then multiplication of (11) by A - I from the left and use of
A -lA = I gives the inverse transformation
x
(14)
=
A-1y.
It maps every y = Yo onto that x, which by (11) is mapped onto Yo. The inverse of a linear
transformation is itself linear. because it is given by a matrix, as (14) shows .
. .R 0 B L E M.
r=uJ
5.£"E7.:~'F
VECTOR SPACES
6. All vectors in
(Additional problems in Problem Set 7.4.)
Is the given set (taken with the usual addition and scalar
multiplication) a vector space? (Give a reason.) If your
answer is yes. find the dimension and a basis.
1. All vectors in
R3
2. All vectors in
VI -
4V2
+ V3
satisfying
5VI -
satisfying
= 0
R3
2VI
3v 2
+
+
2V3 =
3V2 -
0
V3
3. All 2 X 3 matrices with all entries nonnegative
4. All symmetric 3 x 3 matrices
S. All vectors in R 5 with the first three components 0
0,
R4
with
VI
+ V2
=
0,
V3 -
V4 =
I
7. All skew-symmetric 2 X 2 matrices
8. All n X n matrices A with fixed n and det A = 0
9. All polynomials with positive coefficients and degree
3 or less
10. All functions f(x)
constants a and b
=
a cos x + h sin x with any
11. All functions I(x) = (ax + b)e-X with any constants
a and b
12. All 2 X 3 matrices with the second row any multiple
of [4 0 -9]
CHAP. 7
330
Linear Algebra: Matrices, Vectors, Determinants. Linear Systems
13. (Different bases) Find three bases for R2.
14. (Uniqueness)
20. Y1 =
the
representation
v = c1a(1) + ... + cna(n) of any given vector in
an n-dimensional vector space V in terms of a given
basis a(1)' ... , a(n) for V is unique.
[IS-20 1
Show
that
Find the inverse transformation. (Show the details of your
work.)
X2
Y2 = 3X1 - x 2
17. Y1 =
3X1 -
X2
18. Y1 = 0.25x1
Y2 =
19. h =
.. ---.
.....
--.
INNER PRODUCT. ORTHOGONALITY
121-261
LINEAR TRANSFORMATIONS
16. Y1 = 5X1 -
)'2 =
...-.
Find the Euclidean nonn of the vectors
21. [4 2 -6]T
22. [0 -3 3 0 5 I]T
23. [16 -32 O]T
24. [~
2S. [0
I !
26. [~
-~
1
2f
0 -1
0
if
27. (Orthogonality) Show that the vectors in Probs. 21
and 23 are orthogonaL
28. Find all vectors v in R3 orthogonal to [2 0 I]T.
29. (Unit vectors) Find all unit vectors orthogonal to
[4 -3]T. Make a ~ketch.
30. (Triangle inequality) Verify (4) for the vectors in
Probs. 21 and 23.
TIONS AND PROBLEMS
1. What properties of matrix multiplication differ from
those of the multiplication of numbers? What about
division of matrices?
11. 9x - 3y = 15
2. Let A be a 50 x 50 matrix and B a 50 X 20 matrix.
Are the following expressions defined or not? A + B,
A 2 , B2, AB. BA. AAT. BTA. BTB, BBT, BTAB. (Give
reasons.)
3. How is matrix mUltiplication motivated?
12. -2x - 4y
4. Are there any linear systems without solutions? With
one solution? With more than one solution? Give simple
examples.
S. How can you give the rank of a matrix in tenns of row
vectors? Of column vectors? Of determinants?
6. What is the role of rank in connection with solving
linear systems?
7. What is the row space of a matrix? The column space?
The null space?
8. What is the idea of Gauss elimination and back
substitution?
9. What is the inverse of a matrix? When does it exist?
How would you determine it?
10. What is Cramer's rule? When would you apply it?
IU-191 LINEAR SYSTEMS
Find all solution~ or indicate that no solution exists. (Show
the details of your work.)
-l]T
5x
+ 4y
x
13. 3x
x
+
=
2y
48
+
+
7z = -6
16:;: =
3
+ 5y -
8z = ] R
+
3z =
2}' -
+
15. - 8 x
6y
12x
+
14. 5x -
6
=
-x
+
2
v = 13
6
6y =
2y
+ z
5x - 4)"
+
=
3z =
0
18. -x
+
4y - 2z =
5x - 4)' = 47
3x
+
4)"
17. 3x
6x
+
+
7y =
9y = 15
19. 7 x + 9)' - 14z =
-x - 3y +
2x
+
Y -
x - 2y
36
2z = -12
4z =
4
-]
z=-12
2x+3y-
3
=2
2y
+
16.
2z =
+ 4z
3x
lOy =
+
6z =
+ 2z
=
32
331
Summary of Chapter 7
120-301
CALCULATIONS WITH MATRICES AND
VECTORS
Calculate the following expressions (showing the details of
your work) or indicate why they do not exist, when
A
~ r:
2
IT r-:
[l b~
B=
18
-6
15
10
a=
2
0
3
-J
21. A - AT
+ B2
T
24. AA , ATA
23. det A, det B, det AB
26. Aa, aTA, aTAa
27. aTb, bTa, ab T
28. bTBb
29. aTB, ETa
22. A2
+
Find the inverse or state why it does not exist. (Show details.)
37.
38.
39.
40.
41.
42.
Of the coefficient matrix in Prob. 11
Of the coefficient matrix in Prob. 15
Of the coefficient matrix in Prob. 16
Of the coefficient matrix in Prob. 18
Of the augmented matrix in Prob. 14
Of the diagonal matrix with entries 3, -1, 5
143--451 NETWORKS
Find the currents in the following networks.
[]
20. AB, BA
30. O.I(A
INVERSE
137-@
12
II
32. In Frob. 12
33. In Prob. 17
35. In Prob. 19
36. In Prob. 18
---
3400 V 40 Q
100Q
131-361
RANK
Determine the ranks of the coefficient matrix and the
augmented matrix and state how many solutions the linear
system will have.
31. In Frob. 13
-
20U
220V
45.
34. In Prob. 14
3800 V
44.
13
25. O.2BBT
AT)(B - BT)
lOQ
43.
~1020 V
~540V
20Q
Linear Algebra: Matrices, Vectors, Determinants
Linear Systems of Equations
An m X n matrix A = [ajk] is a rectangular array of numbers or functions ("entries",
"elements") arranged in 111 horizontal rows and n vertical columns. If 111 = n, the
matrix is called square. A 1 X 11 matrix is called a row vector and an m X 1 matrix
a column vector (Sec. 7.1).
The sum A + B of matrices of the same size (i.e., both m X n) is obtained by
adding corresponding entries. The product of A by a scalar c is obtained by
multiplying each ajk by c (Sec. 7.1).
The product C = AB of an m X n matrix A by an r X p matrix B = [bjk ] is
defined only when r = n, and is the 111 X P matrix C = [Cjk] with entries
(1)
(row j of A times
column k of B).
332
CHAP. 7
Linear Algebra: Matrices. Vectors. Determinants. Linear Systems
This multiplication is motivated by the composItIOn of linear transfonnations
(Secs. 7.2, 7.9). It is associative, but is 1I0t commutative: if AB is defined, BA may
not be defined, but even if BA is defined, AB =t- BA in general. Also AB = 0 may
not imply A = 0 or B = 0 or BA = 0 (Secs. 7.2, 7.8). Illustrations:
[~ ~J [-:
=
[~ ~J
[-1 IJ [I2 ~J = [-: -:J
2] [:J = [11], [:J [I 2] = [:
J
[l
-:J
-1
:J.
The transpose AT of a matrix A = [ajk] is AT = [akj]; rows become columns and
conversely (Sec. 7.2t Here. A need not be square. If it is and A = AT, then A is called
symmetric; if A = _AT, it is called skew-symmetric. For a product. (AB)T = BTAT
(Sec. 7.2).
A main application of matrices concerns linear systems of equations
(2)
Ax
=
b
(Sec. 7.3)
(m equations in n unknowns Xl' . . . ,xn ; A and b given). The most important method
of solution is the Gauss elimination (Sec. 7.3), which reduces the system to
"triangular" form by elementary row operations. which leave the set of solutions
unchanged. (Numeric aspects and variants. such as Doolittle's and Cholesky's
methods. are discussed in Secs. 20.1 and 20.2)
Cramer's rule (Sees. 7.6, 7.7) represents the unknowns in a system (2) of n
equations in Il unknowns as quotients of determinants; for numeric work it is
impractical. Determinants (Sec. 7.7) have decreased in importance, but will retain
their place in eigenvalue problems, elementary geometry, etc.
The inverse A -I of a square matrix satisfies AA -I = A -IA = I. It exists if and
only if det A =t- O. It can be computed by the Gauss-Jordan elimination (Sec. 7.8).
The rank r of a matrix A is the maximum number of linearly independent rows
or columns of A or, equivalently, the number of rows of the largest square submatrix
of A with nonzero determinant (Secs. 7.4. 7.7).
The system (2) has solutions if and only if rank A = rank [A b], where [A b]
is the augmented matrix (Fundamental Theorem, Sec. 7.5).
The homogeneous system
(3)
Ax = 0
has solutions x =t- 0 ("nontrivial solutions") if and only if rank A <
m = n equivalently if and only if det A = 0 (Secs. 7.6. 7.7).
11,
in the case
Vector spaces, inner product spaces, and linear transformations are discus'ied in
Sec. 7.9. See also Sec. 7.4.
.,J
-------.j' :.;-" . .
·f . - '
! C HAP T E R
~ ~;/
8
"
Linear Algebra:
Matrix Eigenvalue Problems
Matrix eigenvalue problems concern the solutions of vector equations
(1)
Ax = AX
where A is a given square matrix and vector x and scalar A are unknown. Clearly, x = 0
is a solution of (I), giving 0 = O. But this of no interest, and we want to find solution
vectors x*-O of (l), called eigenvectors of A. We shall see that eigenvectors can be
found only for certain values of the scalar A: these values A for which an eigenvector
exists are called the eigenvalues of A. Geometrically, solving (1) in this way means that
we are looking for vectors x for which the multiplication of x by the matrix A has the
same effect as the multiplication of x by a scalar A, giving a vector Ax with components
proportional to those of x, and A as the factor of proportionality.
Eigenvalue problems are of greatest practical interest to the engineer, physicist, and
mathematician, and we shall see that their theory makes up a beautiful chapter in linear
algebra that has found numerous applications.
We shall explain how to solve that vector equation (1) in Sec. 8.1, show a few typical
applications in Sec. 8.2, and then discuss eigenvalue problems for symmetric,
skew-symmetric. and orthogonal matrices in Sec. 8.3. In Sec. 8.4 we show how to obtain
eigenvalues by diagonalization of a matrix. We also consider the complex counterparts of
those matrices (Hermitian. skew-Hermitian. and unitary matrices, Sec. 8.5). which playa
role in modern physics.
COMMENT. Numerics for eigenvalues (Sees. 20.6-20.9) can be studied immediately
after this chapter.
Prerequisite: Chap. 7.
Sections that may be omitted il1 a shorter course: 8.4, 8.5
References and Answers to Problems: App. I Part B, App. 2.
333
CHAP. 8
334
8.1
Linear Algebra: Matrix Eigenvalue Problems
Eigenvalues, Eigenvectors
From the viewpoint of engineering applications, eigenvalue problems are among the most
important problems in connection with matrices, and the student should follow the present
discussion with particular attention. We begin by defining the basic concepts and show how
to solve these problems, by examples as well as in general. Then we shall turn to applications.
Let A = [ajk] be a given Il X 11 matrix and consider the vector equation
(1)
Ax
= AX.
Here x is an unknown vector and A an unknown scalar. Our task is to determine x's and
A's that satisfy (I). Geometrically, we are looking for vectors x for which the multiplication
by A has the same effect as the multiplication by a scalar A; in other words, Ax should
be proportional to x.
Clearly. the zero vector x = 0 is a solution of (I) for any value of A. because AO = O.
This is of no interest. A value of A for which (I) has a solution x =1= 0 is called an eigenvalue
or characteristic value (or latent root) of the matrix A. ("Eigen" is German and means
"proper" or "characteristic.") The corresponding solutions x =1= 0 of (l) are called the
eigenvectors or characteristic vectors of A corresponding to that eigenvalue A. The set
of all the eigenvalues of A is called the spectrum of A. We shall see that the spectrum
consists of at least one eigenvalue and at most of n numerically different eigenvalues. The
largest of the absolute values of the eigenvalues of A is called the spectral radius of A,
a name to be motivated later.
How to Find Eigenvalues and Eigenvectors
The problem of determining the eigenvalues and eigenvectors of a matrix is called an
eigenvalue problem. (More precisely: an algebraic eigenvalue problem, as opposed to
an eigenvalue problem involving an ODE, POE (see Sees. 5.7 and 12.3) or integral
equation.) Such problems occur in physical, technical, geometric, and other applications,
as we shall see. We show how to solve them, first by an example and then in general.
Some typical applications will follow afterwards.
E X AMP L E 1
Determination of Eigenvalues and Eigenvectors
We illustrate all the steps in terms of the matrix
A [-5 2J.
=
2
Soluti01l.
-2
tal Eige1lvalues. These must be determined first. Equation (1) is
Ax = [52 -22J [X1J = A [X1J ;
-"2
Transferring the
term~
in components.
X2
on the right to the left. we get
(-5 - A)xl
+
=0
(2*)
2X1
This can be written in matrix notation
+ (-2 -
A)X2 =
O.
SEC. 8.1
335
Eigenvalues, Eigenvectors
(A - AI)x = 0
(3*)
because (1) is Ax - Ax = Ax - Alx = (A - M)x = 0, which gives (3*). We see that this is a homogeneous
linear system. By Cramer's theorem in Sec. 7.7 it has a nontrivial solution x
0 (an eigenvector of A we are
looking for) if and only if its coefficient determinant is zero, that is,
'*
(4*)
2
1=(-5-,\)(-2-A)-4=A2+7,\+6=0.
-2 - A
D(A)=detCA-AI)=I-S-A
2
We call D(A) the characteristic detenninant or, if expanded, the characteristic polynomial, and D(A) = 0
the characteristic equation of A. The solutions of this quadratic equation are Al = -I and A2 = -6. These
are the eigenvalues of A.
(hI) Eigellvector of A correspollding to AI' This vector is obtained from (2*) with A = Al = -I, that is,
A solution is x2 = 2~'1> as we see from either of the two equations, so that we need only one of them. ThIs
detennines an eigenvector corresponding to Al = -I up to a scalar multiple. If we choose Xl = 1. we obtain
the eigenvector
Check:
AXI =
[-5 2J [IJ [-IJ
2
-2
2
-2
= (-I1xI = Alxl'
(b2 ) Eigenvector of A correspollding to A2 • For A = A2 = -6. equation (2*) becomes
Xl
2xI
A solution is x2 = - xI I2 with arbitrary
corresponding to A2 = - 6 is
Xl'
+
2x2
=
0
+ 4x2
=
O.
If we choose Xl = 2, we get
X2
= -1. Thus an eigenvector of A
Check:
This example illustrates the general ca'ie as follows. Equation (I) written in components is
Transferring the tenns on the right side to the left side, we have
(2)
+
+
(a22 - A)X2
+ ... +
=0
+ ... +
=0
+ ... +
In matrix notation,
(3)
(A - AI)x
= o.
(ann - A)Xn
= O.
CHAP. 8
336
Linear Algebra: Matrix Eigenvalue Problems
By Cramer's theorem in Sec. 7.7, this homogeneous linear system of equations has a
nontrivial solution if and only if the corresponding determinant of the coefficients is zero:
(4) D(A) = det (A - AI) =
A
a l2
al n
a21
a22 - A
a2n
anI
a n2
all -
= O.
ann - A
A - AI is called the characteristic matrix and D( A) the characteristic determinant of
A. Equation (4) is called the characteristic equation of A. By developing D(A) we obtain
a polynomial of nth degree in A. This is called the characteristic polynomial of A.
This proves the following important theorem.
THEOREM 1
Eigenvalues
The eif;envalues of a square matrix A are the roots of the characteristic equatioll
of A.
Hence an Il X n matrix has at least one eigellvalue and at most Il numerically
different eigellvalues.
(4)
For larger Il. the actual computation of eigenvalues will in general require the use
of Newton' s method (Sec. 19.2) or another numeric approximation method in
Secs. 20.7-20.9.
The eigenValues must be determined first. Once these are known, corresponding
eigenvectors are obtained from the system (2), for instance, by the Gauss elimination.
where A is the eigenvalue for which an eigenvector is wanted. This is what we did in
Example I and shall do again in the examples below. (To prevent misunderstandings:
numeric approximation methods (Sec. 20.8) may determine eigenvectors first.)
Eigenvectors have the following properties.
THEOREM 2
Eigenvectors, Eigenspace
Jfw and x are eigenvectors of a matrix A corresponding to the same eigenvalue A,
so are w + x (provided x
-w) and kxfor allY k
O.
*
*
Hence the eigenvectors correspollding to one and the same eigenvalue A of A,
together with 0, fOl1n a !'ector space (cf. Sec. 7.4), called the eigenspace of A
corresponding to that A.
PROOF
Aw = Aw and Ax = Ax imply A(w + x) = Aw + Ax = Aw
= k(Aw) = k(Aw) = A(kw); hence A(kw + ex) = A(kw
A(kw)
+
Ax = A(w
+ .fx).
+
x) and
•
In particular. an eigenvector x is detel1nined only up to a constant factor. Hence we can
normalize x, that is. multiply it by a scalar to get a unit vector (see Sec. 7.9). For
instance,
Xl
= [I
2JT in Example I has the length
"Xl"
=
V 12 + 22
[lIvS 21vSf is a normalized eigenvector (a unit eigenvector).
= vS; hence
SEC. 8.1
337
Eigenvalues. Eigenvectors
Examples 2 and 3 will illustrate that an n X n matrix may have n linearly independent
eigenvectors. or it may have fewer than n. In Example 4 we shall see that a real matrix
may have complex eigenvalues and eigenvectors.
E X AMP L E 2
Multiple Eigenvalues
Find the eigenvalues and eigenvectors of
A =
[-~
-1
Solutioll.
=:].
2
-2
0
For our matrix, the characteristic determinant gives the characteristic equation
g
-A -
The roots (eigenvalues of A) are Al
(Sec. 7.3) to the system (A - Al)x
matrix is
-7
A - AI = A-51 =
A2
+ 21A +
45 = O.
= 5, A2 = Ag = -3. To find eigenvectors, we apply the Gauss e1immation
= O. first with A = 5 and then with A = -3. For A = 5 the characteristic
2
It row-reduces to
[
-1
Hence it has rank 2. Choosing
Xg
= -1 we have
x2
= 2 from
-¥X2 -
~Xg
-7xI + 2X2 - 3xg = O. Hence an eigenvector of A coresponding to A = 5 is
For A = - 3 the characteristic matrix
A - AI
= A + 31 = [ 2
-I
: =:]
-2
Xl
= 0 and then xl = I from
= [I 2 _I]T.
row-reduce, to
3
Hence it has rank I. From Xl + 2Y2 - 3xg = 0 we have Xl = - i l2 + 3xg. Choosing x2 = 1. Xg = 0 and
= 0, Xg = 1, we obtain two linearly independent eigenvectors of A corresponding to A = -3 [as they must
exist by (5), Sec. 7.5. with rank = 1 and /I = 3],
x2
~ ~ [:]
and
x,~ [:1
•
The order M). of an eigenvalue A as a root of the characteristic polynomial is called the
algebraic multiplicity of A. The number 11l}. of linearly independent eigenvectors
corresponding to A is called the geometric multiplicity of A. Thus In}. is the dimension of
the eigenspace corresponding to this A. Since the characteristic polynomial has degree n,
the sum of all the algebraic multiplicities must equal n. In Example 2 for A = - 3 we have
111)0. = M)o. = 2. In general, 111)0. ~ MAo as can be shown. The difference 6..}. = M)o. - 11l}. is
called the defect of A. Thus 6..-3 = 0 in Example 2, but positive defects 6..}. can easily occur:
CHAP. 8
338
E X AMP L E 3
Linear Algebra: Matrix Eigenvalue Problems
Algebraic Multiplicity, Geometric Multiplicity. Positive Defect
The characteristic equation of the matrix
is
Hence A = 0 is an eigenvalue of algebraic multiplicity Mo = 2. But its geometric multiplicity is only tno = I,
since eigenvectors result from -Oxl + x2 = 0, hence x2 = 0, in the form [Xl 0] T. Hence for A = 0 the defect
is 6. 0 = 1.
Similarly, the characteristic equation of the matrix
3
det(A _ AI) = 1 - A
is
o
2
1 = (3 - A)2 = O.
3 - A
Hence A = 3 is an eigenvalue of algebraic multiplicity M3 = 2. but its geometric multiplicity is only tn3 = I,
since eigenvectors result from OXI + 2x2 = 0 in the form [Xl OlT.
•
E X AMP L E 4
Real Matrices with Complex Eigenvalues and Eigenvectors
Since real polynomials may have complex roots (which then occur in conjugate pairs), a real matrix may have
complex eigenvalues and eigenvectors. For instance. the characteristic equation of the skew-symmetric matrix
is
det(A-AI) = I-A
-I
II=A2 +1=0.
-A
It gives the eigenvalues Al = i (=v=I), A2 = -i. Eigenvectors are obtained from -ixl
iXl
+
x2 =
+
X2
= 0 and
0, respectively, and we can choose Xl = I to get
[J
•
and
In the next section we shall need the following simple theorem.
THEOREM 3
Eigenvalues of the Transpose
The transpose AT of a square matrix A has the same eigenvalues as A.
PROOF
Transposition does not change the value of the characteristic determinant, as follows from
Theorem 2d in Sec. 7.7.
•
Having gained a first impression of matrix eigenvalue problems, in the next section we
illustrate their importance with some typical applications.
11-251
EIGENVALUES AND EIGENVECTORS
Fmd the eigenvalues and eigenvectors of the following
matrices. (Use the given A or factors.)
1. [-2o 0]
0.4
4.
S. [5 -2J
9
-6
[~ ~J
SEC. 8.1
7.
o.s
[ 0.6
Eigenvalues, Eigenvectors
-0.6J
8.
O.S
10.
339
[~ ~J
e
sin e
COS
[
-sin
eJ
cos
e
20.
21.
:[ : -~ -:]
o
0
19
-1
o
0
-1
19
o
-2
2
-4
2
-2
4
o
o
2
2
-4
2
-6
4
o
,A = 4
o
4
-2
22.
~]
4
-1
-2
2
-2
3
, (A - 3)2
23.
24.
0.2
0.1]
1.0
1.5
o
3.5
0]
12
25.
, (A
+
1)2
-4
-1
17.
26. (Multiple eigenvalues) Find further 2 X 2 and 3 X 3
matrices with multiple eigenvalues. (See Example 2.)
o
18.
3
6
27. (Nonzero defect) Find further 2 X 2 and 3 X 3
matrices with positive defect. (See Example 3.)
12]
:
,A = 9
28. (Transpose) Illustrate Theorem 3 with examples of
your own.
29. (Complex eigenvalues) Show that the eigenvalues of
a real matrix are real or complex conjugate in pairs.
19.
30. (Inverse) Show that the inverse A -1 exists if and only
if none of the eigenvalues AI, ... , An of A is zero, and
then A-I has the eigenvalues lIAl>' .. , lIAn-
CHAP. 8
340
8.2
Linear Algebra: Matrix Eigenvalue Problems
Some Applications of Eigenvalue Problems
In this section we discuss a few typical examples from the range of applications of matrix
eigenvalue problems, which is incredibly large. Chapter 4 shows matrix eigenvalue
problems related to ODEs governing mechanical systems and electrical networks. To keep
our present discussion independent of Chap. 4, we include a typical application of that
kind as our last example.
E X AMP L E 1
Stretching of an Elastic Membrane
An elastic membrane in the Xlx2-plane with boundary circle X1 2
point P: (Xl, X2) goes over into the point Q: (Y1, Y2) given by
y
(I)
=
[Y1J = Ax = [5
3J [XIJ ;
3
Y2
5
+
X2
2
I (Fig. 158) is stretched so that a
in components.
x2
Find the prinCipal directions, that is, the directions of the position vector x of P for which the direction of the
position vector y of Q is the same or exactly opposite. What shape does the boundary circle take under this
deformation?
We are looking for vectors x ~uch that y
of an eigenvalue problem. In components, Ax = Ax IS
Solutioll.
5X1
+
=
(5 - A)X1
3x2 = '\'\:1
3x 1
+
Ax, this gives Ax
+
or
(2)
=
Ax. Since y
Ax, the equation
=0
+ (5 -
5x2 = A.\:2
=
A)X2 =
O.
The characteristic equation is
3
(3)
5-A
I
=
(5 - A)2 - 9 = O.
Its solutions are Al = 8 and '\'2 = 2. These are the eigenvalues of our problem. For A = Al = 8, our system
(2) becomes
Solution.\"2 = Xl'
for instance,
Xl =
Xl arbitrary,
X2 = I.
For A2 = 2, our system (2) becomes
Solution
-Xl.
X2 =
for instance. Xl = 1,
Xl arbitrary,
x2 =
-1.
We thus obtain as eigenvectors of A, for instance, [1 I]T corresponding to Al and [I
I]T corresponding to
A2 (or a nonzero scalar multiple of these). These vectors make 45° and 135 0 angles with the positive Xl-direction.
They give the principal directions. the answer to our problem. The eigenvalues show that in the principal
directions the membrane is stretched by factors 8 and 2, respectively; see Fig. 158.
Accordingly. if we choose the principal directions as directions of a new Cartesian "1"2-coordinate system.
say, with the positive lll-semi-axis in the first quadrant and the positive 112-senu-axis in the second quadrant of
the xlx2-system. and if we set III = rcos cP. "2 = rsin cP. then a boundary point of the unstretched circular
membrane has coordinates cos cP, sin cP. Hence. after the stretch we have
21
Since cos
(4)
2
cP +
sin
2
cP =
=
8 cos
cP.
:2 = 2 sin cb.
I. this shows that the deformed boundary is an ellipse (Fig. 158)
1.
•
SEC. 8.2
Some Applications of Eigenvalue Problems
341
/
/
/
/
Fig. 158.
E X AMP L E 2
Undeformed and deformed membrane in Example 1
Eigenvalue Problems Arising from Markov Processes
Markov processes as considered in Example 13 of Sec. 7.2 lead to eigenvalue problems if we ask for the limit
state of the precess in which the stale vecmr x is reproduced under the multiplication by the smchastic marrix
A governing the process. that is, Ax = x. Hence A should have the eigenvalue I, and x should be a corresponding
eigenvector. This is of practical interest because it shows the long-term tendency of the development modeled
by the process.
In that example,
A
=
0.7
0.1
0.2
0.9
For the transpose,
[
0.7
0.2
0.1
0.9
°
0.2
[
0.1
°
Hence AT has the eigenvalue I, and the same is true for A by Theorem 3 in Sec. 8.1. An eigenvector x of A
for A = I is obtained from
A
~ [~:::
I
=
U.l
0.1
~O.I
o
:.2] ,
1110
row-reduced to
~0.2
~1I30
°
Taking x3 = 1, we get x2 = 6 from ~x2/30 + x3/5 = 0 and then Xl = 2 frem ~3XI/1O + x2/1O = O. This
gives x = [2 6 I]T It means that in the long run. the ratio Commercial: Industrial: Residential will approach
2:6: I, provided that the probabilities given by A remain (about) the same. (We switched to ordinary fractions
to avoid rounding errors.)
•
E X AMP L E 3
Eigenvalue Problems Arising from Population Models. Leslie Model
The Leslie model describes age-specified population growth, as follows. Let the oldest age attained by the
females in some animal population be 9 years. Divide the population into three age classes of 3 years each. Let
the "Leslie matrix" be
2.3
(5)
°
0.3
where i llc is the average number of daughters born to a single female during the time she is in age class k, and
(j = 2, 3) is the fraction of females in age class j ~ I that will survive and pass into class j. (a) What is
the number of females in each cia" after 3, 6, 9 years if each class initially consists of 400 females? (b) For
what initial distribution will the number of females in each class change by the same proportion? What is this
rate of change?
lj,j_l
342
CHAP. 8
Solution.
Linear Algebra: Matrix Eigenvalue Problems
(a) Initially, x;o) = [400
400
2.3
~3)
= L,,<O) =
n
400]. After 3 years,
[0:'
0
04]
o
0.3
o
400
=
[1000]
240.
400
120
Similarly. after 6 years the number of females in each class is given by X;6) = (LX(3»T = [600 648 72]. and
after 9 years we have X;9) = (LX(6»T = [1519.2 360 194.4].
(b) Proportional change means that we are looking for a distribution vector x such thai Lx = Ax, where A
is the rate of change (growth if A > I, decrease if A< I). The characteristic equation is (develop the characteristic
determinant by the first column)
det (L - AI) = - A3 - 0.6( -2.3A - 0.3·0.4) = - A3
+ 1.38A + 0.072 = O.
A positive root is found to be (for instance. by Newton's method. Sec. 19.2) A = 1.2. A corresponding eigenvector
x can be detennined from the characteristic matrix
2.3
0.4]
-1.2
0.3
o .
say,
-\.2
X=[0.5]
0.125
where x3 = 0.125 is chosen, X2
0.5 then follows from 0.3x2 - 1.2<3 = 0, and Xl = 1 from
-1.2<1 + 2.3x2 + 0.4x3 = O. To get an initial population of 1200 as before, we multiply x by
1200/(1 + 0.5 + 0.125) = 738. Answer: Proportional growth of the numbers of females in the three classes
will occur if the initial values are 738, 369, 92 in classes I, 2, 3, respectively. The growth rate will be 1.2 per
3~
•
E X AMP L E 4
Vibrating System of Two Masses on Two Springs (Fig. 159)
Mass-spring systems involving several masses and springs can be treated as eigenvalue problems. For instance,
the mechanical system in Fig. 159 is governed by the system of ODEs
(6)
where Yl and Y2 are the displacements of the masses from rest. as shown in the figure, and primes denote
derivatives with respect to time t. In vector form, this becomes
(7)
y" =
[Y~J
= Ay = [-5
Y2
2
2J [YlJ.
-2
Y2
(Net change in
spring length
=Y 2 -Y 1 )
Y2
System in
static
equilibrium
Fig. 159.
System in
motion
Masses on springs in Example 4
SEC 8.2
Some Applications of Eigenvalue Problems
343
We try a vector solution of the form
(8)
This is suggested by a mechanical system of a single mass on a spring (Sec. 2.4), whose motion
exponential functions (and sines and cosines). Substitution into (7) gives
IS
given by
Dividing by e wt and writing w2 = A, we see that our mechanical system leads to the eigenvalue problem
Ax = Ax
(9)
From Example I in Sec. 8.1 we see that A has the eigenvalues Al = -I and A2
w = V=! = -::'::.i and Y=6 = -::'::.iV6, respectively. COlTesponding eigenvectors are
-6. Consequently,
and
(10)
From (8) we thus obtain the four complex solutions [see (10), Sec. 2.21
Xle±it = Xl(cost -::': . isint),
X2e±iV6t = x2 (cos
V6 t
-::'::. i sin V6 t).
By addition and subtraction (see Sec. 2.2) we get the four real solutions
Xl
co~
t.
Xl sin t.
X2 cos
V6 t,
x2 sin V6 t.
A general solution is obtained by taking a linear combination of these.
with arbitrary constants aI, b l , a2. b 2 (to which values can be assigned by prescribing initial displacement and
initial velocity of each of the two masses). By (10), the components of yare
Yl = al cos t + b l sin t + 2a2 cos
Y2
= 2al cos t
+ 2b l
V6 t +
sin t - a2 cos
V6 t
2b2 sin
V6 t
- b 2 sm V6 t.
These functions describe harmonic oscillations of the two masses. Physically, this had to be expected because
we have neglected damping.
•
•••
11-61
-
LINEAR TRANSFORMATIONS
Find the matrix A in the indicated linear transformation
y = Ax. Explain the geometric significance of the
eigenvalues and eigenvectors of A. Show the details.
1. Reflection about the y-axis in R2
2. Reflection about the xy-plane in R3
17-141
7.
3. Orthogonal projection (perpendicular projection) of R2
onto the x-axis
4. Orthogonal projection of R3 onto the plane y = x
9.
the origin in R2
[:
[2.S
I.S
5. Dilatation (uniform stretching) in R2 by a factor S
6. Counterclockwise rotation through the angle 7r12 about
ELASTIC DEFORMATIONS
Given A in a deformation y = Ax, find the principal
directions and corresponding factors of extension or
contraction. Show the details.
11.
:J
1.SJ
6.S
[~ V:J
8.
[0.4
0.8
0.8J
0.4
[:
I;J
12. [S
2
l~J
10.
344
CHAP. 8
13. [-2
3
3J
-2
Linear Algebra: Matrix Eigenvalue Problems
14. [
10.5
1IY2]
lfY2
10.0
20.
[
0.6
0.1
0~4
0.1
15. (Leontief1 input-output model) Suppose that three
industries are interrelated so that their outputs are used
as inputs by themselves, according to the 3 X 3
consumption matrix
0.5
0.2
A = [ajk] =
°
0.6
[
0.2
0.5
where ajk is the fraction of the output of industry k
consumed (purchased) by industry j. Let Pj be the price
charged by industry.i for its total output. A problem is
to find prices so that for each industry, total
expenditures equal total income. Show that this leads
to Ap = p, where p = [PI PZ P3]T, and find a
solution p with nonnegative PI, Pz, P3'
16. Show that a consumption matrix as considered in Prob.
15 must have column sums 1 and always has the
eigenvalue I.
17. (Open Leontief input-output model) If not the whole
output but only a portion of it is consumed by the
industries themselves, then instead of Ax = x (as in
Prob. 15), we have x - Ax = y, where x = [Xl Xz X3]T
is produced, Ax is consumed by the industries, and, thus,
y is the net production available for other consumers.
Find for what production x a given demand vector
y = [0.136 0.272 0.136]T can be achieved if the
consumption matrix is
0.2
A =
0.3
[
0.2
0.4
°
U.4
0.2]
0.1 .
0.5
118-201 MARKOV PROCESSES
Find limit states of the Markov processes modeled by the
following matrices. (Show the details.)
18.
[0.1
0.9
19.
[05
0.3
U.5
0.2
0.2
0.4
0.4
0.8
POPULATION MODEL WITH AGE
SPECIFICATION
Find the growth rate in the Leslie model (see Example 3)
with the matrix as given. (Show details.)
3.45
21. [0: °
0.45
12.0
n.
[0:5
°
0.30
7.280
U
[0:60
°
0.420
24. TEAM PROJECT. General Properties of
Eigenvalues and Eigenvectors. Prove the following
statements and illustrate them with examples of your
own choice. Here, A]o ... , An are the (not necessarily
distinct) eigenvalues of a given /J X 11 matrix A = [ajk].
(a) Trace. The sum of the main diagonal entries is called
the trace of A. It equals the sum of the eigenvalues.
(b) "Spectral shift." A - kI has the eigenvalues
Al - k, ... , An - k and the same eigenvectors as A.
(c) Scalar multiples, powers. kA has the eigenvalues
I..A1 , . . . • kA.,. Am (Ill = I. 2.... ) has the eigenvalues
At'. .... k,,"'. The eigenvectors are those of A.
(d) Spectral mapping theorem. The ''polynomial
matrix"
has the eigenvalues
O.4J
0.6
0.3
0.2]
p(Aj ) = kmA/"
02]
0.2
0.6
+
k",_IA/n -
1
+ ... + k1Aj +
ko
where.i = I,' .. , 11, and the same eigenvectors as A.
(e) Perron's theorem. Show that a Leslie matrix L with
positive lIZ' 113, Iz]o 13z has a positive eigenvalue. (This
is a special case of the famous Perron-Frobenius theorem
in Sec. 20.7, which is difficult to prove in its general form.)
lWASSILY LEONTIEF (1906-1999). American economist at New York University. For his input-output
analysis he was awarded the Nobel Prize in 1973.
SEC. 8.3
8.3
Symmetric, Skew-Symmetric, and Orthogonal Matrices
345
Symmetric, Skew-Symmetric,
and Orthogonal Matrices
We consider three classes of real square matrices that occur quite frequently in applications
because they have several remarkable properties which we shall now discuss. The first
two of these classes have already been mentioned in Sec. 7.2.
DEFINITIONS
Symmetric, Skew-Symmetric, and Orthogonal Matrices
A real square matrix A = [ajk] is called
symmetric if transposition leaves it unchanged,
AT = A,
(1)
thus
skew-symmetric if transposition gives the negative of A,
AT = -A,
(2)
thus
orthogonal if transposition gives the inverse of A,
(3)
E X AMP L E 1
Symmetric, Skew-Symmetric, and Orthogonal Matrices
The matrices
~
-::].
-20
0
are symmetric, skew-symmetric, and orthogonal, respectively, as you should verify. Every skew-synuuetric
matrix has all main diagonal entries zero. (Can you prove this?)
•
Any real square matrix A may be written as the sum of a symmetric matrix R and a
skew-symmetric matrix S, where
(4)
EXAMPLE 1
and
Illustration of Formula (4)
A
~
5
[;
']
3
-8
4
3
=
R + S =
3.5
[9C
3.0
35] + [ 0
1.5
3.5
-~.O
-1.5
3.5
-2.0
3.0
1.5
6.0
0
-15]
-6.0
0
•
346
CHAP. 8
Linear Algebra: Matrix Eigenvalue Problems
Eigenvalues of Symmetric and Skew-Symmetric Matrices
THEOREM 1
(a) The eigenvalues of a symmetric matrix are real.
(b) The eigenvalues of a skew-symmetric matrix are pure imaginal}' or zero.
This basic theorem (and an extension of it) will be proved in Sec. 8.5.
E X AMP L E 3
Eigenvalues of Symmetric and Skew-Symmetric Matrices
The matrices in (1) and (7) of Sec. 8.2 are symmetric and have real eigenvalues. The skew-symmetric matrix
in Example 1 has the eigenvalues O. -25 i, and 25 i. (Verify this.) The following matrix has the real eigenvalues
1 and 5 but is not symmetric. Does this contradict Theorem I?
•
Orthogonal Transformations and Orthogonal Matrices
OrthogonaJ transformations are transformations
(5)
y
=
where A is an orthogonal matrix.
Ax
With each vector x in R n such a transformation assigns a vector y in Rn. For instance,
the plane rotation through an angle 0
y =
(6)
[Yl]
Y2
=
[C~s 0
sm e
-sin
cos
0]
[Xl]
e
X2
is an orthogonal transformation. It can be shown that any orthogonal transformation in
the plane or in three-dimensional space is a rotation (possibly combined with a reflection
in a straight line or a plane. respectively).
The main reason for the importance of orthogonal matrices is as follows.
THEOREM 2
In variance of Inner Product
An orthogonal transfonnation preserves the value of the inner product of vectors
a and bin Rn. defined by
(7)
That is, for any a and b in R n , orthogonaln X 11 matrix A, and u = Aa, v = Ab
we have u·v = a·b.
Hence the tra11Sfonnation also preserves the length or norm of any vector a in
Rn given by
(8)
II a II = v'a.3 = Wa.
SEC 8.3
347
Symmetric, Skew-Symmetric, and Orthogonal Matrices
PROOF
Let A be orthogonal. Let u = Aa and v = Ab. We must show that u· v = a· b. Now
(Aa)T = a TAT by (lOd) in Sec. 7.2 and ATA = A-I A = [by (3). Hence
(9)
From this the invariance of
II a II
follows if we set b
=
•
a.
Orthogonal matrices have further interesting properties as follows.
THEOREM 3
Orthonormality of Column and Row Vectors
A real square matrix is orthogonal if and only if its column vectors a1> ... , an (and
also its row vectors) form an orthonormal system, that is,
(10)
if j = k.
PROOF
(a) Let A be orthogonal. Then A-I A = ATA = I, in tenns of colullUl vectors a1> ... , an'
(11)
T
1
I=A- A=A A=
1T
a ]
lalTal
:T [al···a,.,]=
~
l an
an~
1
al T. a2 : : : a T.an ] .
a n Ta2' .. anTan
The last equality implies (0), by the definition of the n X n unit matrix I. From (3) it
follows that the inverse of an orthogonal matrix is orthogonal (see CAS Experiment 20).
Now the column vectors of A-I (= AT) are the row vectors of A. Hence the row vectors
of A also form an orthonormal system.
(b) Conversely, if the column vectors of A satisfy (10), the off-diagonal entries in (11)
must be 0 and the diagonal entries 1. Hence ATA = I, as (11) shows. Similarly, AAT = I.
This implies AT = A-I because also A- 1A = AA -1 = I and the inverse is unique. Hence
A is orthogonal. Similarly when the row vectors of A form an orthonormal system, by
•
what has been said at the end of part (a).
THEOREM 4
Determinant of an Orthogonal Matrix
The detenninant of an orthogonal matrix has the value
PROOF
+ 1 or -1.
From det AB = det A det B (Sec. 7.8, Theorem 4) and det AT = det A (Sec. 7.7, Theorem
2d), we get for an orthogonal matrix
I = det I = det (AA -1) = det (AA T) = det A det AT = (det A)2.
E X AMP L E 4
•
Illustration of Theorems 3 and 4
The last matrix in Example I and the matrix in (6) illustrare Theorems 3 and 4 because their determinants are
-1 and + 1. as you should verify.
•
348
CHAP. 8
THEOREM 5
Linear Algebra: Matrix Eigenvalue Problems
Eigenvalues of an Orthogonal Matrix
The eigenvalues of an orthogonal matrix A are real or complex conjugates in pairs
and have absolute value I.
PROOF
E X AMP L E 5
The first part of the statement holds for any real matrix A because its characteristic
polynomial has real coefficients. so that its zeros (the eigenvalues of A) must be as
indicated. The claim that IAI = 1 will be proved in Sec. 8.5.
•
Eigenvalues of an Orthogonal Matrix
The orthogonal matrix in Example 1 has the characteristic equation
Now one of the eigenvalues must be real (why?). hence + I or -1. Trying. we find -1. Division by A + I
gives _(A2 - 5Al3 + 1) = 0 and the two eigenvalues (5 + iVll)/6 and (5 - iVll)/6. which have absolute
value I. Verify all of this.
•
Looking back at this section. you will find that the numerous basic results it contains have
relatively short, straightforward proofs. This is typical of large portions of matrix
eigenvalue theory .
. . 08 LE~--S-EI-83"
e
sin e
1. (Verification) Verify the statements in Example 1.
COS
2. Verify the statements in Examples 3 and 4.
3. Are the eigenvalues of A + B of the form Aj + Mj.
where A.i and p; are the eigenvalues of A and B,
respecti vely?
4. (Orthogonality) Prove that eigenvectors of a
symmetric matrix corresponding to different
eigenvalues are orthogonal. Give an example.
11.
12.
e
cos e
-sin
[
°
°
13.
5. (Skew-symmetric matrix) Show that the inverse of a
skew-symmetric matrix is skew-symmetric.
6. Do there exist nonsingular skew-symmetric
matrices with odd II?
11
X
11
15.
7. (Orthogonal matrix) Do there exist skew-symmetric
orthogonal 3 X 3 matrices?
8. (Symmetric matrix) Do there exist nondiagonal
symmetric 3 X 3 matrices that are orthogonal?
19-171
17.
EIGENVALUES OF SYMMETRIC, SKEWSYMMETRIC, AND ORTHOGONAL
MATRICES
Are the following matrices symmetric, skew-~ymmetric, or
orthogonal? Find their spectrum (thereby illustrating
Theorems 1 and 5). (Show the details of your work.)
9. [0.96
0.28
-0.28J
0.96
10.
[a
-b
bJ
a
18. (Rotation in space) Give a geometric interpretation of
the transformation y = Ax with A as in Prob. 12 and
x and y referred to a Cartesian coordinate system.
19. WRITING
PROJECT.
Section
Summary.
Summarize the main concepts and facts in this section,
with illustrative examples of your own.
SEC 8.4
Eigenbases. Diagonalization. Quadratic Forms
20. CAS EXPERIMENT. Orthogonal Matrices.
(a) Products. Inverse. Prove that the product of two
orthogonal matrices is orthogonaL and so is the inverse
of an orthogonal matrix. What does this mean in terms
of rotations?
(b) Rotation. Show that (6) is an orthogonal
transformation. Verify that it satisfies Theorem 3. Find
the inverse transformation.
(e) Powers. Write a program for computing powers
Am (17l = 1. 2.... ) of a 2 X 2 matrix A and their
8.4
349
spectra. Apply it to the matrix in Prob. 9 (call it A). To
what rotation does A correspond? Do the eigenvalues
of Am have a limit as 111 _ x?
(d) Compute the eigenvalues of (O.9A)"'. where A is
the matrix in Prob. 9. Plot them as points. What is their
limit? Along what kind of curve do these points
approach the limit?
(e) Find A such that y = Ax is a counterclockwise
rotation through 30° in the plane.
Eigenbases. Diagonalization.
Quadratic Forms
So far we have emphasized properties of eigenvalues. We now turn to general properties
of eigenl'ectors. Eigenvectors of an n X n matrix A may (or may not!) form a basis for
Rn. If we are interested in a transformation y = Ax, such an "eigenbasis" (basis of
eigenvectors)-if it exists-is of great advantage because then we can represent any x in
R n uniquely as a linear combination of the eigenvectors Xl' • . . , Xn , say,
And, denoting the corresponding (not necessarily distinct) eigenvalues of the matrix A by
AI' ... , An. we have AXj = A:iXj, so that we simply obtain
y
(I)
= Ax = A(clX I + ... + cnxn)
= ciAx i + ... + cnAx"
= cIAlx I + ... + cnAnxn·
This shows that we have decomposed the complicated action of A on an arbitrary vector
x into a sum of simple actions (multiplication by scalars) on the eigenvectors of A. This
is the point of an eigenbasis.
Now if the n eigenvalues are all different, we do ohtain a hasis:
Basis of Eigenvectors
THEOREM 1
If an n
X
Xl • . . . •
PROOF
n matrix A has 11 distillct eigenvalues, then A has a basis of eigenvectors
xnfor Rn.
All we have to show is that Xl' . • • • Xn are linearly independent. Suppose they are not.
Let r be the largest integer such that {Xl> . • • • x,.} is a linearly independent set. Then
r < n and the set {Xl.' ••• Xn xr+d is linearly dependent. Thus there are scalars
CI> ••• , Cr+I, not all zero, such that
(2)
(see Sec. 7.4). Multiplying both
(3)
side~
by
A
and using
AXj
= AjXj.
we obtain
350
CHAP. 8
Linear Algebra: Matrix Eigenvalue Problems
To get rid of the last term, we subtract Ar+l times (2) from this, obtaining
Here cl(A l - Ar+l) = O..... c,·(Ar - A,-+l) = 0 since {Xl' ... ,xrl is linearly independent.
Hence Cl = ... = cr = 0, since all the eigenvalues are distinct. But with this, (2) reduces
to c r +lxr +l = 0, hence Cr +l = 0, since xr+l 1:- 0 (an eigenvector!). This contradicts the fact
that not all scalars in (2) are zero. Hence the conclusion of the theorem must hold.
•
E X AMP L E 1
Eigenbasis. Nondistinct Eigenvalues. Nonexistence
The matrix A = [ :
:J
has a
ba~is of eigenvectors
[:
J' [-:J
corresponding to the eigenvalues
= 8. A2 = 2. (See Example I in Sec. 8.2.)
Even if not all n eigenvalues are different, a matrix A may still provide an eigenbasis for RTl. See Example
2 in Sec. 8.1, where n = 3.
On the other hand, A may not have enough linearly independent eigenvectors to make up a basis. For
instance, A in Example 3 of Sec. 8.1 is
Al
~J
and has only one eigenvector
(k
*-
O. arbitrary).
•
Actually, eigenbases exist under much more general cunditions than those in Theorem L.
An important case is the following.
THEOREM 2
Symmetric Matrices
A symmetric matrix has all ortllOno1711al basis of eigellvectors for Rn.
For a proof (which is involved) see Ref. [B3], vol. I, pp. 270-272.
E X AMP L E 2
Orthonormal Basis of Eigenvectors
The first matrix in Example I i~ symmetric, and an orthonormal basis of eigenvectors is [IIV:>:
[1IV2 -ltif2:r
Diagonalization of Matrices
Eigenbases also playa role in reducing a matrix A to a diagonal matrix whose entries are
the eigenvalues of A. This is done by a "similarity transformation." which is defined as
follow,> (and will have various applications in numerics in Chap. 20).
DEFINITION
Similar Matrices. Similarity Transformation
An n X
11
matrix
A is called similar to an 11
X
11
matrix A if
(4)
for some (nonsinguiar!) 11 X n matrix P. Thi~ transformation. which gives
A. is called a similarity transformation.
A from
SEC 8.4
)51
Eigenbases. Diagonalization. Quadratic Forms
The key property of this transformation is that it preserves the eigenvalues of A:
THEOREM)
Eigenvalues and Eigenvectors of Similar Matrices
If A is similar to A, then A !las the same eigenvalues as A.
Furthennore, {f x is all eigenvector of A, then y = p-1x is an eigenvector of A
corresponding to the same eigenvalue.
PROOF
From Ax = Ax (A an eigenvalue, x *- 0) we get P-1Ax = AP-1x. Now 1= pp-l. By
this "identity trick" the previous equation gives
Hence A is an eigenvalue of A and p-1x a corresponding eigenvector. Indeed, p-1x = 0
would give x = Ix = pp-1x = PO = 0, contradicting x *- O.
•
E X AMP L E)
Eigenvalues and Vectors of Similar Matrices
A= [6 -3J
Let
4
A= [
Then
and
-I
p= [I 3J
I
4
4 -3JI [Ii4 -3J
[I1 43J [30 OJ.
-I
2
-I
Here p-l was obtained from (4*) in Sec. 7.8 with det P = 1. We see that A has the eigenvalues Al = 3,
A2
The characteristic equation of is
A)(
A) +
A2 - 5A +
It has the roots (the
eigenvalues of A) Al = 3, A2 = 2, confirming the first part of Theorem 3.
We confirm the second part. From the first component of (A - AI)x = 0 we have (6 - A)XI - 3"2 = O.
For A = 3 this gives 3.'1 - 3X2 = O. say. Xl = [1 l]T. For A = 2 it gives 4xl - 3X2 = O. say. X2 = [3 4]T.
In Theorem 3 we thus have
=2.
A l6 - -1 -
12 =
6=O.
Indeed. these are eigenvectors of the diagonal matrix A.
Perhaps we see that Xl and x2 are the columns of P. This suggests the general method of transforming a
matrix A to diagonal form D by using P = X, the matrix with eigenvectors as columns:
•
THEOREM 4
Diagonalization of a Matrix
If an n
X 11
matrix A has a basis of eigenvectors, then
(5)
is diagonal, with the eigenvalues of A as the entries on the main diagonal. Here X
is the matrix H·ith these eigenvectors as colu11ln vectors. Also,
(5*)
D m = X-1A"'X
(/11
= 2,3,' . ').
CHAP. 8
352
PROOF
Linear Algebra: Matrix Eigenvalue Problems
Let Xb ... , Xn constitute a basis of eigenvectors of A for Rn. Let the corresponding
eigenvalues of A be Ab ... , An' respectively, so that Ax] = A1Xb· .. ,AXn = Anx".
Then X = [Xl
x.,] has rank n, by Theorem 3 in Sec. 7.4. Hence X-I exists by
Theorem 1 in Sec. 7.8. We claim that
(6)
AX = A[X1
where D is the diagonal matrix as in (5). The fourth equality in (6) follows by direct
calculation. (Try it for n = 2 and then for general n.) The third equality uses AXk = Akxk.
The second equality results if we note that the first column of AX is A times the first
column of X, and so on. For instance. when n = 2 and we write Xl = [XU X21]T,
X2 = [X12 X22]T, we have
AX =
A[X1
X2]
[(/11
(/12]
(/21
(/22
[(/l1 X 11
(/21X 11
X12 ]
[X11
X 21
X22
+ (/12 X 21
+ (/22 X 21
(/l1X 12
+
(/21 X 12
+ (/22 X 22
Column I
(/12 X 22 ]
= [AX1 Ax2].
Column 2
If we multiply (6) by X-I from the left, we obtain (5). Since (5) is a similarity
transformation, Theorem 3 implies that D has the same eigenvalues as A. Equation (5*)
follows if we note that
•
etc.
E X AMP L E 4
Diagonalization
Diagonalize
7.3 0.2 -3.7]
A =
-11.5
1.0
5.5
17.7
1.8
-9.3
[
.
The characteristic determinant gives the characteristic equation -A3 - A2 + 12A = O. The roots
(eigenvalues of A) are Al = 3, A2 = -4, A3 = O. By the Gauss elimination applied to (A - AI)x = 0 with
A = A]. A2• A3 we find eigenvectors and then X-I by the Gauss-Jordan elimination (Sec. 7.8. Example 1). The
results are
Solution.
0.2 0.3]
-0.2
0.2
0.7.
-0.2
Calculating AX and multiplying by X-I from the left, we thus obtain
D = X-lAX =
[-0'
-1.3
-0.2
0.8
0.2
0.2
0'][-3
-4
0.7
9
4
-0.2
-3
-12
:] ~ [:
0
-4
0
:1
•
SEC. 8.4
353
Eigenbases. Diagonalization. Quadratic Forms
Quadratic Forms. Transformation to Principal Axes
By definition, a quadratic form Q in the components Xl, ... , Xn of a vector x is a sum
of n 2 terms, namely,
n
T
Q = x Ax =
n
2. 2. ajkxjxk
j~l k~l
(7)
+ ........................ .
A = [ajk] is called the coefficient matrix of the fmID. We may assume that A is symmetric,
because we can take off-diagonal telIDS together in pairs and write the result as a sum of
two equal terms; see the following example.
E X AMP L E 5
Quadratic Form. Symmetric Coefficient Matrix
Let
Here 4 t- 6 = 10 = 5 -t 5. From the corresponding symmetric matrix C
thus Cll = 3, C12 = C21 = 5, C22 = 2, we get the same result; indeed,
= [Cjk],
where
Cjk = ~(ajk
+
aid)'
•
Quadratic forms occur in physics and geometry. for instance. in connection with conic
sections (ellipses X12/a 2 + X22/b 2 = 1, etc.) and quadratic surfaces (cones, etc.). Their
transformation to principal axes is an important practical task related to the diagonalization
of matrices, as follows.
By Theorem 2 the symmetric coefficient matrix A of (7) has an orthonormal basis of
eigenvectors. Hence if we take these as column vectors, we obtain a matrix X that is
orthogonal, so that X-l = XT. From (5) we thus have A = XDX- 1 = XDXT. Substitution
into (7) gives
(8)
If we set XTx
(9)
= y, then, since XT = x-I, we get
x = Xy.
Furthermore, in (8) we have xTX = (XTX)T = yT and XTx = y, so that Q becomes simply
(10)
354
CHAP. 8
Linear Algebra: Matrix Eigenvalue Problems
This proves the following basic theorem.
THEOREM 5
Principal Axes Theorem
The substitution (9) transforms a quadratic form
n
Q
= xTAx =
n
2. 2.
ajkXjXk
j=l k=I
to the principal axes form or canonical form (10), where Ab ...• An are the (not
necessarily distinct) eigenvalues of the (s)"lnmetric!) matrix A, and X is an
orthogonal matrix with corresponding eigenvectors Xl' . . . , xn , respectively, as
colu11ln vectors.
E X AMP L E 6
Transformation to Principal Axes. Conic Sections
Find out what type of conic section the following quadratic form represents and transform it to principal
axes:
Solution.
We have Q = xTAx. where
A=[
I7
-15
-15J.
I7
This gives the characteristic equation 07 - A)2 - 15 2 = O. It has the roots A1 = 2. A2 = 32. Hence (10)
becomes
We see that Q = 128 represents the ellipse 2.1'/
+ 32yl
=
128. that is.
If we want to kno" the direction of the principal axes in the Xlx2-coordinates. we have to determine normalized
eigenvectors from (A - AI)x = 0 with A = Al = 2 and A = A2 = 32 and then use (9). We get
[ I/V~.
]
and
II\., 2
[-IIVzJ.
IIVz
hence
x = XJ' =
IIVz -IIVz J [YIJ
[IIVz
1IV2
.1'2'
Xl = yI/Vz - Y2 /Vz
X2 = y 1 /V2 + Y2/V2.
This is a 45° rotation. Our results agree with those in Sec. 8,2, Example I, except for the notations. See also
•
Fig. 158 in that example.
SEC 8.4
.... __ r------." ;...= ;...• ...
r
- - - - - ..
11-91
3.
5.
[:
====-_
~--
~
(d) Diagonalization. What can you do in (5) if you
want to change the order of the eigenvalues in D, for
instance. interchange d l l = Aland d22 = A2?
DIAGONALIZATION OF MATRICES
Find an eigenbasis (a basis
diagonalize. (Show the details.)
1.
355
Eigenbases. Diagonalization. Quadratic Forms
of eigenvectors)
2.
:J
[~ ~J
4.
[1.0 6.0J
1.5 1.0
6.
and
[~ I~J
[-:
-~J
SIMILAR MATRICES HAVE EQUAL
SPECTRA
113-181
Verify this for A and A = P-lAP. Find eigenvectors y of
A. Show that x = Py are eigenvectors of A. (Show the
details of your work.)
13.
A [-5o OJ2 ' P
=
=
[: -:J
4 -2JI
[
-3
3
14. A =
[
3
0 :1
10
-1:1
9.
l-1: 39
-24 40 -15
7.
[4
-6
0
8.
-5
l-'
-5 1:1
-9 -9 13
10. (Orthonormal basis) Illustrate Theorem 2 with further
examples.
11. (No basis) Find further 2 X 2 and 3 X 3 matrices
without eigenbases.
12. PROJECT. Similarity of Matrices. Similarity is
basic, for instance in designing numeric methods.
(a) Trace. By definition, the trace of an 11 X 11 matrix
A = [ajk] is the sum of the diagonal entries,
trace A = all
+
a22
+ ... + ann'
Show that the trace equals the sum of the eigenvalues,
each counted as often as its algebraic multiplicity
indicates. Illustrate this with the matrices in Probs. 1.
3,5.7,9.
(b) Trace of product. Let B = [bjk } be 11 X n. Show
that similar matrices have equal traces, by first
proving
n
n
trace AB = ~ ~
15. A = [
= trace BA.
i~ll~l
[1
3
-6
1
6
l-~ ~0 ~:1· l~0
=
P =
-5
15
~ ~]
0
10
0
0
~1
1
TRANSFORMATION TO PRINCIPAL AXES.
CONIC SECTIONS
119-281
What kind of conic section (or pair of straightlines) is given
by the quadratic form? Transform it to principal axes.
Express xT = [Xl X2] in terms of the new coordinate vector
yT = [Yl ."2]' as in Example 6.
19. X1 2 + 24xlX2 - 6X22 = 5
20. 3X12 + 4V3x l X2 + 7X 2 2 = 9
21. 3-'"1 2 - 8XlX2 - 3-'"22 = 0
23.
4X12
+
+
24. 7X12 (c) Find a reiationship between A in (4) and A = PAP-I.
P=
P =
2]
18. A
2J
-2
l : -~ ~] , l~
17. A =
22. 6X12
ailbli
4
-4
25.
X1
2
-
16-'"1-'"2 -
6X22 =
24xlX2
12xl-'"2
20
+ 2X22 = 10
= 144
2V3xlX2
+
X2
2
=
35
CHAP. 8
356
Linear Algebra: Matrix Eigenvalue Problems
+ 22xIX2 + 3X22 = 0
12xI2 + 32xIX2 + 12x22 =
6.5xI 2 + 5.0X]x2 + 6.5x22
26. 3X]2
27.
28.
Q(x)
112
36
=
29. (Definiteness) A quadratic fonn Q(x) = xT Ax and its
(symmetric!) matrix A are called (a) positive definite
if Q(x) > 0 for all x
0, (b) negative definite if
Q(x) < 0 for all x
O. (e) indefinite if Q(x) takes
both positive and negative values. (See Fig. 160.) [Q(x)
and A are called positive semidefinite (negative
semidefinite) if Q(x) ~ 0 (Q(x) ~ 0) for all x.J A
necessary and sufficient condition for positive
definiteness is that all the "principal minors" are
positive (see Ref. [B3]. vol. 1. p. 306), that is.
"*
"*
lall
(/]2
all> 0,
.1:2
(a)
PosItIve defimte form
Q(x)
aI21 >0,
a22
(b) Negative defm Ite form
all
aI2
a]3
a I2
a22
a23 >0,
a]3
a 23
a33
detA > O.
Q(x)
Show that the form in Prob. 23 is positive definite,
whereas that in Prob. 19 is indefinite.
30. (Definiteness) Show that necessary and sufficient for
(a), (b), (c) in Prob. 29 is that the eigenvalues of A are
(a) all positive, (b) all negative, tc) both positive and
negative. Hint. Use Theorem 5.
8.5
(c) Indefinite form
Fig. 160.
QuadratiC forms in two variables
Optional
Complex Matrices and Forms.
The three classes of real matrices in Sec. 8.3 have complex counterparts that are of practical
interest in certain applications, mainly because of their spectra (see Theorem 1 in this
section), for instance, in quantum mechanics. To define these classes, we need the
following standard
Notations
A
= [ajk]
is obtained from A
= [lljk]
(a,.!! real) with its complex conjugate ajk
= a + i{3
= [al0] is the transpose
by replacing each entry lljk
= 0'-
i{3. Also, AT
of A, hence the conjugate transpose of A.
E X AMP L E 1
Notations
3 + 4i
If A =
[
6
J,
1- i
:>. - 5i
then A = [3 - 4i
6
1+
i ]
2+5;
and
-T
A
=
[3 - 4;
1+;
6 ]
2 + 5;
••
SEC. 8.5
Complex Matrices and Forms.
DEFINITION
Optional
357
Hermitian, Skew-Hermitian, and Unitary Matrices
A square matrix A
= [aid] is called
Hermitian
if r=A,
skew-Hermitian
if
unitary
if AT = A-I.
AT =-A,
that is,
akj
=
ajk
that is,
akj
=
-ajk
The first two classes are named after Hermite (see footnote 13 in Problem Set 5.8).
From the definitions we see the following. If A is Hermitian. the entries on the main
diagonal must satisfy ajj = ajj; that is, they are rea1. Similarly, if A is skew-Hermitian,
then ajj = -aii' If we set aij = 0' + i{3, this becomes 0' - i{3 = -(0' + i(3). Hence
0' = 0, so that aji must be pure imaginary or O.
E X AMP L E 2
Hermitian, Skew-Hermitian, and Unitary Matrices
A=
4
[ 1+ 3i
1,
3i
B= [
-2
c=
t- i
2'
[
!V3
are Hennitian, skew-Hennitian, and unitary matrices, respectively, as you may verify by using the definitions.
If a Hermitian matrix is real, then AT = AT = A. Hence a real Hermitian matrix is a
symmetric matrix (Sec. 8.3.).
Similarly, if a skew-Hermitian matrix is real, then AT = AT = -A. Hence a real
skew-Hermitian matrix is a skew-symmetric matrix.
Finally, if a unitary matrix is real, then AT = AT = A -1. Hence a real unitary matrix
is an orthogonal matrix.
This shows that Hennitian, skew-HeI7l1itian, and unitary matrices generalize symmetric,
skew-symmetric, and orthogonal matrices, respectively.
Eigenvalues
It is quite remarkable that the matrices under consideration have spectra (sets of eigenvalues;
see Sec. 8.1) that can be characterized in a general way as follows (see Fig. 161).
1m A
I
Skew-Hermitian (skew-symmetric)
Unitary (orthogonal)
Hermitian (symmetric)
ReA
Fig. 161. Location of the eigenvalues of Hermitian,
skew-Hermitian, and unitary matrices in the complex A-plane
CHAP. 8
358
Linear Algebra: Matrix Eigenvalue Problems
Eigenvalues
THEOREM 1
(a) The eigenvalues
qf a Hermitian matrix tand tllUS of a symmetric matrix) are
real.
(b) The eigenvalues (~f a skew-Hermitial1 matrix (and thus
matrix) are plIre imaginwy or ::,ero.
qf a skew-symmetric
(e) The eigenvalues of a unitary matrix (and thus of an orthogonal matrix) have
absolute FaIlle L
E X AMP L E 3
Illustration of Theorem 1
For the matrices in Example 2 we find by direct calculation
Matrix
A
B
C
Hermitian
Skew-Hennitian
Unitary
Charactenstic Equation
A2 - llA + I~ = 0
A2 - liA + 8 = 0
A2 - iA - 1 = 0
Eigenvalues
9,
4i,
2
-2i
~V3 +
ii. -iV3 + ~i
•
PROOF
We prove Theorem L Let A be an eigenvalue and x an eigenvector of A. Multiply Ax =
Ax from the left by xT. thus xTAx = AXTX. and divide by xTx = XIXI + ... + xnxn =
IXll2 + ... + IXnI2, which is real and not 0 because x *- O. This gives
A=
(1)
(a) If A is Hennitian, AT = A or AT = A and we show that then the numerator in (l) is
real, which makes A reaL xT Ax is a scalar; hence taking the transpose has no effect. Thus
(2)
Hence, xT Ax equals its complex conjugate, so that it must be reaL (a
implies b = 0.)
(b) If A is skew-Hermitian, AT = -A and instead of (2) we obtain
+ ib
a - ib
(3)
so that xT Ax equals mmus its complex conjugate and is pure imaginary or O.
(a + ib = -(a - ib) implies a = 0.)
(e) Let A be unitary. We take Ax = Ax and its conjugate transpose
-
(AX)
T
-
T
-
= (Ax) = AXT
and multiply the two left sides and the two right sides,
SEC. 8.5
Complex Matrices and Forms.
But A is unitary.
Optional
AT = A -I,
359
so that on the left we obtain
Together, "TX = IAI2"Tx. We now divide by "TX (* 0) to get IAI2 = 1. Hence IAI
This proves Theorem 1 as well as Theorems I and 5 in Sec. 8.3.
1.
•
Key properties of orthogonal matrices (invariance of the inner product, orthonormality of
rows and columns; see Sec. 8.3) generalize to unitary matrices in a remarkable way.
To see this, instead of R n we now use the complex vector space
of all complex
vectors with 11 complex numbers as ~omponents, and complex numbers as scalars. For
such ,complex vectors the inner product is defined by (note the overbar for the complex
conjugate)
en
(4)
The length or norm of such a complex vector is a real number defined by
THE 0 REM 2
Invariance of Inner Product
A unitary transformation, that is, y = Ax with a unitaJ:': matrix A, preserves the
value of the inner product (4), hence also the norm (5).
PROOF
The proof is the same as that of Theorem 2 in Sec. 8.3, which the theorem generalizes.
In the analog of (9), Sec. 8.3, we now have bars,
T
u·v = fi v
=
-- T
(Aa) Ab
=
_T-T
_
_
a A Ab = aTlb = aTb
=
a·b.
•
The complex analog of an orthonormal systems of real vectors (see Sec. 8.3) is defined
as follows.
DEFINITION
Unitary System
A unitary system is a set of complex vectors satisfying the relationships
if j
(6)
*k
if j = k.
Theorem 3 in Sec. 8.3 extends to complex as follows.
THEOREM 3
Unitary Systems of Column and Row Vectors
A complex square matrix is unitary
row vectors) fOl71l a unitary system.
if and only if its
column vectors (and also its
CHAP. 8
360
PROOF
Linear Algebra: Matrix Eigenvalue Problems
The proof is the same as that of Theorem 3 in Sec. 8.3. except for the bars required in
AT = A -1 and in (4) and (6) of the present section.
•
THE 0 REM 4
Determinant of a Unitary Matrix
Let A be a unitary Inarrix. Then iTS determinanT has absolute m/ue one, that is,
AI
Idet
PROOF
= 1.
Similarly as in Sec. 8.3 we obtain
I = det (AA -1) = det (AAT) = det A det AT = det A det A
= det A det A = Idet A12.
Hence Idet AI
E X AMP L E 4
=
1 (where det
•
A may now be complex).
Unitary Matrix Illustrating Theorems lc and 2-4
For the vectors
and with
3
T
= [2 -il and bT = [I + i 4i] we get aT
0.8i
0.6
A= [
0.6
J
also
0.8i
A3 =
as one can readily verify. This gives (Aa)TAb = -2
columns form a unitary system.
a}T3}
=
[:J
+
= -0.8i· 0.8i + 0.62 = I.
[2
i]T amlaTb = 2(1 + i) - 4 = -2 + 2i
-0.8 + 3.2;J
<\b =
and
[
-2.6
+ 0.6i
,
2i. illustrating Theorem 2. The matrix is unitary. Its
a}T 32 = -0.8i· 0.6 + 0.6' 0.8i = O.
~T 32 = 0.6 + (-0.8i)0.8i = I
2
and so do its rows. Also. det A = -1. The eigenvalues are 0.6 + O.Si and -U.6
[I I]T and [I _I]T. respectively.
+
O.Si, with eigenvectors
•
Theorem 2 in Sec. 8.4 on the existence of an eigenbasis extends to complex mahices as
follows.
THEOREM 5
Basis of Eigenvectors
A Hen71itian, skew-Hemzitian, or unitGl)' matrix has a basis of eigenvectors for
that is a unitary system.
en
For a proof see Ref. [B3], vol. 1, pp. 270-272 and p. 244 (Definition 2).
E X AMP L E 5
Unitary Eigenbases
The matrices A, E, C in Example 2 have the following unitary systems of eigenvectors, as you should verify.
I
A:
---r= [I - 3i
E:
--[1-2;
C:
-[I
5]T
\. 35
I
(A
-5]T
vTo
I
V2
I]T
(A = ~(i
=
(A
I
9),
- - f l - 3i
= -2i),
I
vTo[5
+ V3»,
-2]T
Vi4
I
-[I
V2
1
+ 2i]T
-I]T
(A
(A
(A
=
2)
= 4i)
= !(i - V3».
•
SEC. 8.5
Optional
Complex Matrices and Forms.
361
Hermitian and Skew-Hermitian Forms
The concept of a quadratic form (Sec. 8.4) can be extended to complex. We call the
numerator "TAx in (1) a form in the components Xl> • . • , Xn of x, which may now be
complex. This form is again a sum of n 2 terms
n
"TAx
=
n
2. 2.
ajkXjXk
j~l k~l
(7)
+ ................. .
A is called its coefficient matrix. The fOlTn is called a Hermitian or skew-Hermitian
form if A is Hermitian or skew-Hermitian, respectively. The value qf a Hermitianfonn
is real. and that of a skew-Hennitiall form is pllre imaginw}' or z.ero. This can be seen
directly from (2) and (3) and accounts for the importance of these forms in physics. Note
that (2) and (3) are valid for any vectors because in the proof of (2) and (3) we did not
use that x is an eigenvector but only that "TX is real and not O.
E X AMP L E 6
Hermitian Form
For A in Example 2 and, say, x = [I
xTAx=[l-i
-SiJ
[
4
1
+ 3;
+
i
SilT we get
1- 3iJ [1 + iJ
7
=
[I -
. . [4(1 + i) + (I - 3i)' Si] _ 223.
-SI]
I
Si
-
(l
+ 3;)(1
t- i)
•
+ 7· S;
Clearly, if A and x in (4) are real, then (7) reduces to a quadratic form, as discussed in
the last section .
....... 1. (Verification) Verify the statements in Examples 2
and 3.
2. (Product) Show (BA{ = - AB for A and B in
Example 2. For any n X n Hermitian A and
skew-Hermitian B.
3. Show that (ABC{ = -C-1BA for any n X n
Hermitian A, skew-Hermitian B, and unitary C.
4. (Eigenvectors) Find eigenvectors of A, B, C in
Examples 2 and 3.
15-111
7.
0
9.
[5:
EIGENVALUES AND EIGENVECTORS
-i
2
6. [0
2i
2~J
s]
0
5i
Are the matrices in Probs. 5-11 Hermitian? SkewHermitian? Unitary? Find their eigenvalues (thereby
verifying Theorem 1) and eigenvectors.
5. [4 iJ
-~:J
r-:
8.
10.
[I :i
I
+
i
0
1- i
: i]
[~
~J
362
CHAP. 8
Linear Algebra: Matrix Eigenvalue Problems
COMPLEX FORMS
113-151
Is the given matrix lcall itA) Hermitian or skew-Hermitian?
Find x:TAx. (Show all (he details.) a, b, e, k are reaL
13. [ O.
12. PROJECT. Complex Matrices
(a) Decomposition. Show (hat any square matrix may
be written as the sum of a Hermitian and a
skew-Hermitian matrix. Give examples.
(b) Normal matrix. This important concept denotes
a matrix that commutes with its conjugate transpose,
AAT = ATA. Prove that Hermitian, skew-Hermitian.
and unitary matrices are normal. Give corresponding
examples of your own.
(c) Normality criterion. Prove (hat A is normal if and
only if the Hermitian and skew-Hermitian matrices in
(a) commute.
(d) Find a simple matrix that is nol normal. Find a
nOimal matrix that is not Hermitian, skew-Hermitian.
or unitary.
(e) Unitary matrices. Prove that the product of two
unitary 11 X 11 mau'ices and the inverse of a unitary
matrix are unitary. Give examples.
(f) Powers of unitary matrices in applications may
sometimes be very simple. Show that C 12 = I in
Example 2. Find further examples.
- 3;
-31
14. [
0
J.
x = [4
h
a.
+ ;CJ
b - Ie
+
3 -
'
~J
I
X =
k
[Xl]
X2
15.
16. (Pauli spin matrices) Find the eigenvalues and
eigenvectors of the so-called Pallli Spill111afriees and show
that SxSy = is,, SySX = -iSz, Sx2 = Sy2 = S/ = I,
where
Sy
S
z
=
=
0
[ i
-iJ
0'
I OJ
[0-1
C H A-P T E R-8::: R £ V-I E W-=Q U EST ION 5 AND PRO B L EMS
1. In solving an eigenvalue problem. what is given and
what is sought?
2. Do there exist square matrices without eigenvalues?
Eigenvectors corresponding to more than one
eigenvalue of a given matrix?
14.4
10. [
-11.2
11. [-14
-10
3. What is the defect? Why is it important? Give examples.
4. Can a complex matrix have real eigenvalues? Real
eigenvectors? Gi\'e reasons.
5. What is diagonalization of a matrix? Transformation of
a form to principal axes?
12.
r
7. Does a 3 X 3 matrix always have a real eigenvalue?
8. Give a few typical applications in which eigenvalue
problems occur.
~-:BJ
13.
Find an eigenbasis and diagonalize. (Show the details.)
9.
101
[ -144
72J
-103
r
14.
10J
11
11. =
-2
-7
: : -:]
-4
DIAGONALIZATION
102.6
I: I: -:], 18
-12
6. What is an eigenbasis'? When does it exist? Why is it
important?
-11.2J
-:
r
-4
-~ -~
11
10
Summary of Chapter 8
~ ~-iil
SIMILARITY
Verify that A and
Here, A, Pare:
A = P- 1AP have the same spectrum.
[-~: ~~ ~~], [~
28
-5l:r ,.
-14
29
2
17.
[~
2
I
~J
3.8
15. [
2.4
16.
363
:
8
~]
0
-1
=~], [~ ~
:]
-1
4
3
2
Transformation to Canonical Form. Reduce the quadratic
form to principal axes.
18. 11.56x12 + 20.16x1-'2 + 17 .44x22 = 100
19. 1.09x/ - 0.06X1X2 + \.0\ xl = 1
20. 14x12 + 24xIX2 - 4X22 = 20
1FEIJF~-HA-p..T£R- 8:=
Linear Algebra: Matrix Eigenvalue Problems
The practical importance of matrix eigenvalue problems can hardly be overrated.
The problems are defined by the vector equation
Ax
(I)
=
Ax.
A is a given square matrix. All matrices in this chapter are square. 11. is a scalar. To
solve the problem (1) means to determine values of A, called eigenvalues (or
characteristic values) of A, such that (I) has a nontrivial solution x (that is,
x =1= 0), called an eigenvector of A corresponding to that A. An 11 X 11 matrix has
at least one and at most 11 numerically different eigenvalues. These are the solutions
of the characteristic equation (Sec. 8.1)
all -
(2)
D(A)
= det (A - AI) =
a21
anI
A
a12
a22 -
an2
ain
A
a2n
ann -
= O.
A
D(A) is called the characteristic determinant of A. By expanding it we get the
characteristic polynomial of A, which is of degree n in A. Some typical applications
are shown in Sec. 8.2.
Section 8.3 is devoted to eigenvalue problems for symmetric (AT = A),
skew-symmetric (AT = -A), and orthogonal matrices (AT = A-I). Section 8.4
concerns the diagonalization of matrices and the transformation of quadratic forms
to principal axes and its relation to eigenvalues.
Section 8.5 extends Sec. 8.3 to the complex analogs of those real matrices,
called Hermitian (AT = A). skew-Hermitian (AT = -At and unitary matrices
(AT = A-I). All the eigenvalues of a Hermitian matrix (and a symmetric one) are
real. For a skew-Hermitian (and a skew-symmetric) matrix they are pure imaginary
or zero. For a unitary (and an orthogonal) matrix they have absolute value l.
CHAPTER
·
.
9
Vector Differential Calculus.
Grad, Div, Curl
This chapter deals with vectors and vector functions in 3-space, the space of three
dimensions with the usual measurement of distance (given by the Pythagorean theorem).
This includes 2-space (the plane) as a special case. It extends the differential calculus to
those vector functions and the vector fields they represent. Forces, velocities, and various
other quantities are vectors. This makes the algebra, geometry, and calculus of these vector
functions the natural instrument for the engineer and physicist in solid mechanics. fluid
flow, heat flow, electrostatics, and so on. The engineer must understand these vector
functions and fields as the basis of the design and consuuction of systems, such as
airplanes, laser generators, and robots.
In Secs. 9.1-9.3 we explain the basic algebraic operations with vectors in 3-space.
Calculus begins in Sec. 9.4 with the extension of differentiation to vector functions in a
simple and natural fashion. Application to curves and their use in mechanics follows in
Sec. 9.5.
We finally discuss three physically important concepts related to scalar and vector fields,
namely, the gradient (Sec. 9.7), divergence (Sec. 9.8), and curl (Sec. 9.9). (The use of
these concepts in integral theorems follows in the next chapter. Their form in cunilinear
coordinates is given in App. A3.4.)
We shall keep this chapter independe1lt of Chaps. 7 alld 8. Our present approach is in
harmony with Chap. 7, with the restriction to two and three dimensions providing for a
richer theory with basic physical, engineering, and geometric applications.
Prerequisite: Elementary use of second- and third-order determinants in Sec. 9.3.
Sections that may be omitted in a shorter course: 9.5, 9.6.
References and Answers to Problems: App. I Part B, App. 2.
9.1
Vectors in 2-Space and 3-Space
In physics and geometry and its engineering applications we use two kinds of quantities:
scalars and vectors. A scalar is a quantity that is determined by its magnitude; this is the
number of units measured on a suitable scale. For instance, length. voltage. and temperature
are scalars.
A vector is a quantity that is determined by both its magnitude and its direction. Thus
it is an arrow or directed line segment. For instance, a force is a vector, and so is a
velocity, giving the speed and direction of motion (Fig. 162).
364
SEC 9.1
365
Vectors in 2-Space and 3-Space
We denote vectors by lowercase boldface letters a, b. v, etc. In handwriting you may
use arrows, for instance ii (in place of a), b, etc.
A vecror (arrow) has a tail, called its initial point, and a tip, called its terminal point.
This is motivated in the translation (displacement without rotation) of the triangle in Fig.
163, where the initial point P of the vector a is the original position of a point, and the
terminal point Q is the terminal position of that point, its position after the translation.
The length of the arrow equals the distance between P and Q This is called the length
(or magnitude) of the vector a and is denoted by lal. Another name for length is norm
(or Euclidean nonn).
A vector of length 1 is called a unit vector.
Velocity
--~Earth
I
I
I
I
"
I
"
Force
\
\
\
I
I
I
Fig. 162.
Sun
Fig. 163. Translation
Force and velocity
Of course, we would like to calculate with vectors. For instance, we want to find the
resultant of forces or compare parallel forces of different magnitude. This motivates our
next ideas: to define compollents of a vector. and then the two basic algebraic operations
of vector addition and scalar multiplication.
For this we must first define equality of vectors in a way that is practical in connection
with forces and other applications.
DEFINITION
Equality of Vectors
Two vectors a and b are equal, written a = b, if they have the same length and the
same direction [as explained in Fig. 164; in particular, note (B)l- Hence a vector
can be arbitrarily translated; that is, its initial point can be chosen arbitrarily_
\~
//:
Equal vectors,
a=b
(Al
7/
~
b
Vectors having
the same length
but different
direction
Vectors having
the same direction
but different
length
Vectors having
different length
and different
direction
eE)
(e)
(D)
Fig. 164.
(A) Equal vectors. (B)-(D) Different vectors
366
CHAP.9
Vector Differential Calculus. Grad, Div, Curl
Components of a Vector
We choose an xyz Cartesian coordinate system l in space (Fig. 165). that is, a usual
rectangular coordinate system with the same scale of measurement on the three mutually
perpendicular coordinate axes. Let a be a given vector with initial point P: (xl' YI, ZI) and
tenninal point Q: (X2' Y2, Z2)' Then the three coordinate differences
(1)
are called the components of the vector a with respect to that coordinate system, and we
write simply a = [at> a2, a 3]. See Fig. 166.
The length lal of a can now readily be expressed in tenns of components because from
(l) and the Pythagorean theorem we have
(2)
E X AMP L E 1
Components and Length of a Vector
The vector a with initial point P: (4, 0, 2) and terminal point Q: (6, -1. 2) ha, the "omponents
al =
6 - 4 = 2,
az
= -1 -
0 = - I,
{l3
= 2 - 2 = O.
Hence a = [2. -I. OJ. (Can you sketch a, as in Fig. 166'!) Equation (2) gives the length
If we "hoose (-I, 5, 8) as the initial point of a, the corresponding terminal point is (I, 4, 8).
If we choose the origin (0. O. 0) as the initial point of a, the conesponding terminal point is (2, - I, 0); its
coordinate, equal the components of a. This suggests that we can determine each point in space by a vector,
•
called the positiol! I'ector of the point. as follows.
A Cartesian coordinate system being given, the position vector r of a point A: (x. y, z)
is the vector with the origin (0, 0, 0) as the initial point and A as the terminal point (see
Fig. 167). Thus in components, r = [x, y, z]. This can be seen directly from (1) with
Xl = .\'1 = ;::1 = O.
z
,,
,,
I(
r
/"
---x
y
x
Fig. 165. Cartesian
coordinate system
-_
1
1
1
1
\ \ 11""'-----::> ___
--_
-"
\1"
-"""y
....
Fig. 166. Components
of a vector
Fig. 167. Position vector r
of a point A: (x, y, z)
INamed after the French philosopher and mathematician RENA TUS CARTESIUS. latinized for RENE
DESCARTES (1596--1650), who invented analytic geometry. His basic work Geometrie appeared in 1637. as
an appendix to his Discours de fa mitftode.
SEC 9.1
367
Vectors in 2-Space and 3-Space
Furthennore, if we translate a vector a, with initial point P and terminal point Q, then
corresponding coordinates of P and Q change by the same amount, so that the differences
in (1) remain unchanged. This proves
THEOREM 1
Vectors as Ordered Triples of Real Numbers
A fixed Cartesian coordinate system being given, each vector is uniquely determined
by its ordered triple of corresponding components. Conversely, to each ordered triple
of real numbers (ab a2, a3) there corresponds precisely one vector a = [aI' a2, a3],
with (0, 0, 0) corresponding to the zero vector 0, which has length 0 and no direction.
Hence a vector equation a = b is equivalent to the three equations al = bl ,
a2 = b2, a3 = b3 for the components.
We now see that from our "geometric" definition of a vector as an arrow we have arrived
at an "algebraic" characterization of a vector by Theorem 1. We could have started from
the latter and reversed our process. This shows that the two approaches are equivalent.
Vector Addition, Scalar Multiplication
Applications suggest calculation with vectors that are practically useful and are almost as
simple as the arithmetics for real numbers. The first is addition and the second is
multiplication by a number.
DEFINITION
Addition of Vectors
The sum a + b of two vectors a = [ab a2,
adding the corresponding components,
a3J
and b = [bI> b2, b3J is obtained by
(3)
Fig. 168. Vector
addition
Geometrically, place the vectors as in Fig. 168 (the initial point of b at the terminal
point of a); then a + b is the vector drawn from the initial point of a to the terminal
point of b.
For forces, this addition is the parallelogram law by which we obtain the resultant of two
forces in mechanics. See Fig. 169.
Figure 170 shows (for the plane) that the "algebraic" way and the "geometric way" of
vector addition give the same vector.
Fig. 169.
Resultant of two forces (parallelogram law)
CHAP.9
368
Vector Differential Calculus. Grad, Div, Curl
Basic Properties of Vector Addition.
(see also Figs. 171 and 172)
Familiar laws for real numbers give immediately
(a)
a+b=b+a
(b)
tu + v) + w = u + (v + w)
(C)
a+O=O+a=a
(d)
a + (-a) = O.
( COllllllutativity)
(Associativity)
(4)
Here -a denotes the vector having the length lal and the direction opposite to that of a.
In (4b) we may simply write u + v + w, and similarly for sums of more than three
vectors. Instead of a + a we also write 2a, and so on. This (and the notation -a used
just before) motivates defining the second algebraic operation for vectors as follows.
YI
r~-----------
2
C
r
u: - - : - :
----
I
L:~
___
b!...l_ _
x
Fig. 170.
Vector addition
Fig. 171. Cummutativity
of vector addition
Fig. 172. Associativity
of vector addition
Scalar Multiplication (Multiplication by a Number)
DEFINITION
The product ca of any vector a = [aI, {/2' a3] and any scalar c (real number c) is
the vector obtained by multiplying each component of a by c,
I
a
2a
-a
(5)
-.! a
2
Fig. 173. Scalar
multiplication
[multiplication of
vectors by scalars
(numbers)]
*"
Geometrically, if a
0, then ca with c > 0 has the direction of a and with c < 0
the direction opposite to a. In any case, the length of ca is leal = lellal, and ca = 0
if a = 0 or c = 0 (or both). (See Fig. 173.)
Basic Properties of Scalar Multiplication.
From the definitions we obtain directly
(a)
c(a + b) = ca + cb
(b)
(c + k)a = ca + ka
(c)
c(ka) = (ck)a
(d)
la = a.
(6)
(written cka)
SEC. 9.1
369
Vectors in 2-Space and 3-Space
You may prove that (4) and (6) imply for any vector a
(a)
Oa
=
0
(b)
(-I)a
=
-a.
(7)
+
Instead of b
E X AMP L E 2
(-a) we simply write b - a (Fig. 174).
Vector Addition. Multiplication by Scalars
With respect to a given coordinate system. let
a = [4. O. I]
Then -a
= [-4, o.
-I].
7a
and
= [28,0,71,
2(a - b)
a + b
b
= [6,
= 2[2, 5, ~l = [4,
=
[2.
5.
n
-5. nand
10. ~l
= 2a
•
- 2b.
Besides a = lab a2, a3] another popular way of writing vectors is
Unit Vectors i, j, k.
In this representation, i, j, k are the unit vectors in the positive directions of the axes of
a Cartesian coordinate system (Fig. 175). Hence, in components,
i = [1,
(9)
0,
0],
j
=
[0,
l.
0],
k
= [0,
o.
1]
and the right side of (8) is a sum of three vectors parallel to the three axes.
E X AMP L E 3
i j k Notation for Vectors
•
In Example 2 we have a = 4i + k, b = 2i - 5j + ~k, and so on.
All the vectors a = [aI, lI2, a3] = ali + a2j + a3k (with real numbers as components)
form the real vector space R3 with the two algebraic operations of vector addition and
scalar multiplication as just defined. R3 has dimension 3. The triple of vectors i, j, k is
called a standard basis of R3. A Cartesian coordinate system being given. the
representation (8) of a given vector is unique.
Vector space R3 is a model of a general vector space, as discussed in Sec. 7.9, but is
not needed in this chapter.
ZI
,kl ,
a/
-a/<
,r
b
"....
''l~
~
",-a
Fig. 174. Difference
of vectors
/~
x
y
Fig. 175. The unit vectors i, j, k
and the representation (8)
CHAP. 9
370
:u
11-61
Vector Differential Calculus. Grad, Div, Curl
--.-
COMPONENTS AND LENGTH
Find the components of the vector v with given initial point
P and terminal point Q. Find Ivl. Sketch Ivl. Find the unit
vector in the direction of v.
Q: (5, -2,0)
1. P: (3, 2, 0),
Q: (-4, -4. -4)
2. P: (1. I, 1).
3.
4.
5.
6.
P:
P:
P:
P:
(1. 0, 1.2),
(2, -2,0),
(4, 3, 2),
(0, O. 0).
Q: (0, 0, 6.2)
Q: (0,4,6)
Q: (-4, -3,2)
Q: (6, 8, 10)
-~2J
Given the components vI> V2, V3 of a vector v
and a particular initial point P, find the corresponding
terminal point Q and the length of v.
7. 3, -1,0;
P: (4.6,0)
8. 8,4, -2;
P: (-8, -4,2)
L
9.
!, 2,~;
10. 3,2.6;
11. 4,~, -~;
12. 3, -3,3;
~ 3-20
I
P: (0. -~. ~);
P: (0. O. 0)
P: (-4,~, 2)
P: (1,3, -3)
VECTOR ADDITION AND
SCALAR MULTIPLICATION
Let a = [2, - I, 0] = 2i b = [-4, 2, 5] = -4i + 2j
Find:
13. 2a, -a, -~a
15. 5(a - c). 5a - 5c
16. (3a - 5b) + 2c, 3a +
17. 6a - 4b + 2c, 2(3a 18. (lIla l)a, (1IIcl)c
j,
+
5k, c = [0,0, 3J = 3k.
14. a
+
2b, 2b
+
a
(-5b + 2c)
2b + c)
19. a + b + c, -3a - 3b - 3e
20. la + hi, lal + lb:
21. What laws do Probs. 14-17 illustrate?
22. Prove (4) and (6).
23. Find the midpoint of the segment PQ in Probs. 7 and 9.
=-1-281
Find the
24. p =
v =
25. P =
26. P =
27. P =
28. P =
FORCES
resultant (in components) and its magnitude.
[1,2,0]. q = [0,4, -I], U = [4.0, -3],
[6,2,4]
[2,2,2], q = [-4, -4,0], U = [2,2,7]
[-I, -3, -5]. q = [6.4, 2J, u = [-5, -1. 3]
[8,2. -4], q = 3p, U = -5p
[3.0, -2], q = [2,5, I], u = 4q
29. Find v so that v, p, q. U in Prob. 25 are in equilibrium.
30. For what c is the resultant of [3, I, 7], [4, 4, 5], and
[3. 2. c] parallel to the x\"-plane?
31. Find forces P. q, U in the direction of the coordinate
axes such that p, q, U, v = [2.3.0], w = [7. -I, 11]
are in equilibrium. Are p. q, U uniquely determined?
32. If Ipi = I and Iql = 2, what can be said about the
magnitude and direction of the resultant? Can you think
of an application where this matters?
33. Same question as in Prob. 32 if IPI = 3. Iql = 2, :ul = I.
34. (Relative velocity) If airplanes A and B are moving
southwest with speed IVAI = 500 mph and northwest
with speed IVBI = 400 mph. respectively, what is the
relative velocity v = VB - VA of B with respect to A?
35. (Relative velocity) Same question as in Prob. 34 for
two ships moving northwest with speed IVAI = 20 knots
and northeast with speed IVBI = 25 knots.
36. (Reflection) If a ray of light is reflected once in each
of two mutually perpendicular mirrors, what can you
say about the reflected ray?
37. (Rope) Find the magnitude of the force in each rope
in the figure for any weight wand angle a.
38. TEAM PROJECT. Geometric Applications. To
increase your skill in dealing with vectors, use vectors
to prove the following (see the figures).
(a) The diagonals of a parallelogram bisect each other.
(b) The line through the midpoints of adjacent sides
of a parallelogram bisects one of the diagonals in the
ratio I: 3.
(e) Obtain (b) from (a).
(d) The three medians of a triangle (the segments from
a vertex to the midpoint of the opposite side) meet at
a single point, which divides the medians in the ratio
2:1.
(e) The quadrilateral whose vertices are the midpoints
of the sides of an arbitrary quadrilateral is a
parallelogram.
(n The four space diagonals of a parallelepiped meet
and bisect each other.
(g) The sum of the vectors drawn from the center of
a regular polygon to its vertices is the zero vector.
~
a
w
Problem 37
Team Project 38(a)
o~
a
Team Project 38{d)
c
C _ b
D~B
~
A
a
Team Project 38(e)
371
SEC. 9.1
Inner Product (Dot Product)
9.2
Inner Product (Dot Product)
We shall now define a multiplication of two vectors that gives a scalar as the product and
is suggested by various applications. in particular when angles between vectors and lengths
of vectors are involved.
DEFINITION
Inner Product (Dot Product) of Vectors
The inner product or dot product a" b (read "a dot b") of two vectors a and b is
the product of their lengths times the cosine of their angle (see Fig. 176),
a" b
(1)
= lallbl cos '}'
if a*O,b*O
if a = 0 or b = O.
a"b = 0
The angle '}', 0 ~ '}' ~ 7T, between a and b is measured when the initial points of the
vectors coincide, a~ in Fig. 176. In components, a = [a!> a2, ag], b = [bI> b 2, bg],
and
(2)
o or b
The second line in (l) is needed because '}' is undefined when a
derivation of (2) from (1) is shown below.
2J(_
b
a.b>O
Fig. 176.
t
a[l
b
-
a.b=O
O. The
~b
a·b<O
Angle between vectors and value of inner product
Orthogonality. Since the cosine in (1) may be positive, 0, or negative. so may be the
inner product (Fig. 176). The case that the inner product is zero is of particular practical
interest and suggests the following concept.
A vector a is called orthogonal to a vector b if a" b = O. Then b is also orthogonal to
a, and we call a and b orthogonal vectors. Clearly, this happens for nonzero vectors if
and only if cos '}' = 0; thus '}' = rr/2 (90°). This proves the important
THEOREM 1
Orthogonality
The inner product of two nonzero vectors is 0 if and only if these vectors are
perpendicular.
372
CHAP 9
Vector Differential Calculus. Grad, Div, Curl
Length and Angle.
Equation (1) with b
(3)
lal
=
=
a gives a'a
=
lal 2. Hence
v;;a.
From (3) and (1) we obtain for the angle 'Y between two nonzero vectors
(4)
E X AMP L E 1
=
cos 'Y
Inner Product. Angle Between Vectors
Find the inner product and the lengths of a
vectors.
Solution.
a'b = J ·3
= [I.
+ 2 'l-2) + O· I
=
2, 0] and b
-I, lal
=
=
[3, - 2, I] as well as the angle between these
Va~ = "\ '5, Ibl
= v'b-b = v'J4, and l4) gives
the angle
a'b
y= arccos lallbl
= arccos (-0.1 1952) = 1.69061 = 96.865°.
•
From the definition we see that the inner product has the following properties. For any
vectors a, b, C and scalars ql, q2,
(a)
(5)
(qla
+ q2b)'C = qla'c + q2b ' c
(b)
a'b = b'a
a'a
~
(Linearity)
(Symmetry)
0
(c)
} (Positive-definitelless).
a'a = 0
if and only if
a=O
Hence dot multiplication is commutative [see (5b)] alld is distributive 'with re.\pect to
vector addition; in fact, from (Sa) with ql = I and q2 = I we have
(5a*)
(a
Furthermore, from (1) and Icos
+ b)'c = a'c + b'c
'YI
(6)
(Distributivity).
3 I we see that
la· bl 3 lallbl
(Cauchy-Schwarz inequality).
Using this and (3). you may prove (see Prob. 18)
(7)
la
+ bl
3 lal
+ Ibl
(Triangle inequality).
Geometrically, (7) with < says that one side of a triangle must be shorter than the other
two sides together; this motivates the name of (7).
A simple direct calculation with inner products shows that
(8)
la
+ bl 2 + la - bl 2 = 2(la1 2 + Ib1 2)
(Parallelogram equality).
Equations (6)-(8) play a basic role in so-called Hilbert .\paces (abstract inner product
spaces), which form the basis of quantum mechanics (see Ref. [GR7] listed in App. I).
SE.c. 9.2
373
Inner Product (Dot Product)
Derivation of (2) from (1). We write a = ali + a2j + ({3k and b = bli + b2 j
as in (8) of Sec. 9.1. If we substitute thi~ into a-b and use (5a*), we first have a
3 X 3 = 9 products
+ b3k,
~um
of
Now i, j, k are unit vectors, so that i- i = j -j = k - k = I by (3). Since the coordinate
axes are perpendicular, so are j, j, k, and Theorem I implies that the other six of those
nine products are 0, namely, j-j = j-j = j-k = k-j = k-j = j-k = O. But this reduces
our sum for a-b to (2).
•
Applications of Inner Products
Typical applications of inner products are shown in the following examples and in Problem
Set 9.2.
E X AMP L E 2
Work Done by a Force Expressed as an Inner Product
This is a major application. It concerns a body on which a cO/lSlOl/l force p acts. (For a l'C/r;able force.
see Sec. 10.1.) Let the body be given a displacement d. Then the work done by p in the displacement is defined as
W =
(9)
ipiidi cos a
=
pod,
that is. magnitude ipi of the force times length idi of the displacement times the cosine of the angle a between
p and d (Fig. 177). If a < 90°. as in Fig. 177. then W> O. If P and d arc orthogonal, then the work is zero
(why'!). If a> 90°. then W < O. which means that in the displacement one has to do work against the force.
•
(Think of swimming across a river at some angle a against the current.)
y
~/~
I
I
• I
'",8
d
Fig. 177.
E X AMP L E 3
Work done by a force
Fig. 178.
Example 3
Component of a Force in a Given Direction
What force in the rope in Fig. 178 will hold a car of 5000 lb in equilibrium if the ramp makes an angle of 25°
with the hori70ntal?
Solutioll. Introducing coordinates as shown. the weight is a = [0. -5000] because this force points
downward. in the negative .,·-direction. We have to represent a as a sum (resultant) of two forces. a = c + p,
where c is the force the car exerts on the ramp. which is of no interest to us. and p is parallel to the rope. of
magnimde (see Fig. 178)
ipl = lal cos l' = 5000 cos 65° =
2113 [Ib)
and direction of the unit vector U opposite to the direction of the rope; here l'
between a and p. Now a vector in the direction of the rope is
b = [-I. tan 25°] = [-I. 0.466311,
thus
,bl
= 90°
- 25°
= 1.10338.
= 65° is the angle
374
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
so that
U =
Since lUI
=
-
1
Tbf
b =
[0.90631, -0.42262].
I and cos y > 0, we see that we can also write our result as
Ipi = (Ial co, y)lul = a'u =
a' b
-lbi
=
5000·0.46631
1.l0338
= 2113 [Ib].
Answer: AbDUl 2100 lb.
•
Example 3 is typical of applications in which one uses the concept of the component or
projection of a vector a in the direction of a vector b (*- 0), defined by (see Fig. 179)
(10)
p
=
lal cos y.
Thus p is the length of the orthogonal projection of a on a straight line I parallel to b,
taken with the plus sign if pb has the direction of b and with the minus sign if pb has the
direction opposite to b; see Fig. 179.
a~
l~~-:
b
'-----v--------
P
(p>O)
Fig. 179.
Multiplying (10) by
(p=O)
(p<O)
Component of a vector a in the direction of a vector b
Ibi/lbi
= I, we have
a-b in the numerator and thus
p=
(11)
(b
*-
0).
If b is a unit vector, as it is often used for fixing a direction, then (11) simply gives
(12)
p
=
a-b
(Ibl =
1).
Figure 180 shows the projection p of a in the direction of b las in Fig. 179) and the
projection q = Ibl cos 'Y of b in the direction of a.
a
q
..-/"i
rZ\ :
~
p
Fig. 180.
Projections p of a on band q of b on a
SEC. 9.2
375
Inner product (Dot Product)
E X AMP L E 4
Orthonormal Basis
By definition. an orthonormal basis for 3-space is a basis (a. b. c) consisting of orthogonal unit vectors. It has
the great advantage that the determination of the coefficients in representations v = 11 a + 12 b + '3c of a given
vector v is very simple. We claim that '1 = a v. 12 = b V, 13 = Co v. Indeed. this follows simply by taking
the inner products of the representation with a, b, c, respectively, and using the orthononnality of the basis,
aov = Ilaoa + ' 2a o b + '3aoc = II, etc.
For example, the unit vectors i. j. k in (8), Sec. 9.1, associated with a Cartesian coordinate system form an
orthonormal basis. called the standard basis with respect to the given coordinate system.
•
0
E X AMP L E 5
0
Orthogonal Straight Lines in the Plane
Find the straight line LI through the point P: O. 3) in the -":I'-plane and perpendicular to the straight line
- 2y + 2 = 0; see Fig. 181.
Lz: x
Solution.
The idea is to write a general straight line L I : alx + a2.\" = c as a r = c with a = [aI, a2] "" 0
and r = [x. y]. according to (2). Now the line Ll * through the origin and parallel to Ll is a r = O. Hence, by
Theorem I, the vector a is perpendicular to r. Hence it is perpendicular to L J * and also to LI because LI and
Ll l' are parallel. a is called a nOl'mal vector of LI (and of Ll *).
Now a nonnal vector of the given line x - 2y + 2 = 0 is b = [I, -2]. Thus L J is perpendicular (0 L2 if
boa = al - 2a2 = 0, for instance, if a = [2, I]. Hence L J is given by 2x + Y = c. It passes through P: (I, 3)
when 2· I + 3 = c = 5. Answer: y = -2x + 5. Show that the point of intersection is (x • .1') = (1.6, 1.8). •
0
0
E X AMP L E 6
Normal Vector to a Plane
Find a unit vector perpendicular to the plane 4x
Solutioll.
+ 2y +
4;:: = -7.
Using (2). we may write an) plane in space as
where a = [al' a2. a31 "" 0 and r = [x. y, z]. The unit vector in the direction of a is (Fig. IS2)
I
n
=
Iaf a.
Dividing by lal. we obtain from (13)
nor
(14)
=
p
where
From (12) we see that p is the projection of r in the direction of n. This projection ha~ the same constant value
ellal for the position vec(Or I' of any point in the plane. Clearly this holds if and only if n is perpendicular to
the plane. n is called a unit nonnal vector of the plane (the other bcing - nl.
Furthermore. from this and the definition of projection it follows that Ipl is the distance of the plane from the
origin. Representation 04) is called Hesse's2 nonnal form of a plane. In our case, a = [4, 2, 41.
c = -7, lal = 6. n = ~a = [~. !.
and the plane has the distance 7/6 frum the origin.
•
n
n
2
Fig. 181.
2
3
Example 5
x
Fig. 182.
Normal vector to a plane
LUDWIG OTTO HESSE (1811-1874), Gennan mathematician who contributed to the theory of curves and
surfaces.
CHAP. 9
376
INNER PRODUCT
11-121
Let a
= [2. I. 4]. b = [-4, 0.3], c = [3, -2, 1]. Find
1. aob,boa
3.
5.
7.
9.
Vector Differential Calculus. Grad, Div, Curl
13a - 2bl, 12b - 3al
(aoblc, a(b·c)
(a - bloc, a·c - boc
ao(b - c), ao(c - b)
11. 6(a
+
b) • (a - b)
2. lal, Ibl, Icl
4. ao(b + c), aob
+ aoc
6. aob + boc + coa
8. 4a 3c, 12a c
0
10. Ib + cl, Ibl
12. la cl, lallcl
+
Icl
0
= uow with u
0/= 0 imply that v
= w?
inequality. and the parallelogram equality for the above
a and b.
17. Prove the parallelogram equality.
18. (Triangle inequality) Prove (7). Hint. Use (3) for
la + bl and (6) to prove the square of (7). then take
roots.
119-221 WORK
Find the work done by a force p acting on a body if the
body is displaced from a point A to a point B along the
straight segment AB. Sketch p and AB. (Show the details
of your work.)
19. p = [8. -4. 11], A: (I. 2. 0). B: (3, 6. 0)
20. p = [2. 7. -4], A: (3. I. m. B: (0. 2, 0)
21. p = [5, -2, I], A: (4, 0, 3), B: (6, 0, 8)
22. p = [4.3. 6J, A: (5. 2. 10). B: (1, 3. l)
23. Why is the work in Prob. 19 zero? Can work be
negative? Explain.
24. Show that the work done by the resultant of p and q
in a displacement from A to B is the sum of the work
done by each force in that displacement.
25. Find the work W = pod if d = 2i and p = i, i + j,
j, -i + j and sketch a figure similar to Fig. 177.
I
ANGLE BETWEEN VECTORS.
ORTHOGONALITY
Let a = ll, I, I], b
angle between:
= [2.3.1], c =
26. a, b
27. b, c
29. a + b, c
[-I, 1,0]. Find the
18. a-c,b-c
30. a, h +
C
31. (Planes) Find the angle between the planes
x + y + .;; = 1 and 2x - J + 2;: = O.
34. (Addition law) Obtain
cos (a - (3) = cos a cos {3
+ sin a sin {3
by using a = [cos a, sin a]. b = [cos {3, sin {3]. where
o ~ a ~ (3 = 27f.
35. (Parallelogram) Find the angles if the sides are [5, 0]
and [I. 21-
IS. Prove the Cauchy-Schwarz inequality.
16. Verify the Cauchy-Schwarz inequality, the triangle
126--30
33. (Triangle) Find the angles of the triangle with vertices
[0, 0, 0], [1, 2, 3], [4, -1, 3].
0
13. What laws do Probs. I. 3.4, 7, 8 illustrate?
14. Does uov
32. (Cosine law) Deduce the law of cosines by using
vectors a, b, and a-b.
36. (Distance) Find the distance of the plane
5x + 2y + z = 10 from the origin.
13740 I
COMPONENTS IN THE DIRECTION
OF A VECTOR
Find the component of a in the direction of b.
37. a
=
[I. 1,3]. b
=
[0, O. 5]
38. a = [2. O. 6], b = [3. 4, - 11
39. a
=
[0.4, -3]. b
= [0.4,3]
40. a
=
[-1,2,0]. b
=
[1, -2,0]
41. Cnder what condition will the projection of a in the
direction of b equal the projection of b in the direction
of a?
42. TEAM PROJECT. Orthogonality is particularly
important. mainly because of the use of orthogonal
coordinates. such as Cartesian coordinates, whose
"natural basis" (9). Sec. 9.1. consists of three
0l1hogonal unit vectors.
(a) Show that a = [2. -2.4]. b = [0.8.4],
c = [-20. -4. 8] are orthogonaL
(b) For what values of al are a
b = [3.4, -IJ 0l1hogonal?
=
(c) Show that the straight lines 4x
5x - 10)" = 7 are orthogonaL
[aI- 2, 0] and
+
2y = 1 and
(d) Find all unit vectors a = [a lo a2] in the plane
orthogonal to [4, 31.
(e) Find all vectors orthogonal to a = [2. l. 0]. Do
they fonn a vector space?
(I) For what c are the plane!> 4x - 2)" + 3.;; = 6 and
2x - cy + 5::: = 1 orthogonal?
(g) Under what condition will the diagonals of a
parallelogram be orthogonal? (Prove your answer.)
(h) What is the angle between a light ray and its
reflection in three orthogonal plane minors (known as
a "corner reflector")"?
(i) Discuss further applications in physics and
geometry in which orthogonality plays a role.
377
SEC. 9.3
Vector Product (Cross Product)
9.3
Vector Product (Cross Product)
The dot product in Sec. 9.2 is a scalar. We shall see that in some applications, for instance,
in connection with rotations, we shall need a product that is again a vector:
DEFINITION
Vector Product (Cross Product, Outer Product) of Vectors
The vector product (also called cross product or outer product) a x b (read "a
crOss b"') of two vectors a and b is the vector
v=axb
as follows. If a and b have the same or opposite direction, or if a = 0 or b = 0,
then v = a x b = O. In any other case v = a x b ha~ the length
(1)
Ivl = la x bl = lallbl sin 'Y.
This is the area of the blue parallelogram in Fig. 183. 'Y is the angle between a and
b (as in Sec. 9.2). The direction of v = a x b is perpendicular to both a and band
such that a, b, v, in this order, form a right-hallded triple as in Figs. 183-185
(explanation below).
In components, let a = [aI' a2' a3] and b = [b 1, b 2, b3]. Then v
has the components
=
[Vb V2, V3] = a x b
(2)
Here the Cartesian coordinate system is right-handed, as explained below (see also
Fig. 186). (For a left-handed system, each component of v must be multiplied by -1.
Derivation of (2) in App. 4.)
Right-Handed Triple. A triple of vectors a, b, v is right-handed if the vectors in the
given order assume the same sort of orientation as the thumb, index finger, and middle
finger of the right hand when these are held as in Fig. 184. We may also say that if a is
rotated into the direction of b through the angle 'Y « 'IT), then v advances in the same
direction as a right-handed screw would if turned in the same way (Fig. 185).
a
Fig. 183.
a
Vector product
Right-handed
triple of vectors a, b, v
Fig. 184.
Fig. 185.
Right-handed
screw
CHAP. 9
378
Vector Differential Calculus. Grad, Div, Curl
z
k
~j
x
/
~
:k
x
y
y
z
(b) Left-handed
(a) Right-handed
Fig. 186. The two types of Cartesian coordinate systems
Right-Handed Cartesian Coordinate System. The system is called right-handed if
the corresponding unit vectors i, j, k in the positive directions of the axes (see Sec. 9.1)
form a right-handed triple as in Fig. 186a. The system is called left-handed if the sense
of k is reversed, as in Fig. 186b. In applications, we prefer right-handed systems.
How to Memorize (2).
(2) can be written
If you know second- and third-order determinants, you see that
(2*)
and v = [VI, V 2 , V3] = VIi + V~ + V3k is the expansion of the following symbolic
determinant by its first row. (We call the determinant "symbolic" because the first row
consists of vectors rather than of numbers.)
j
(2**)
v= a x b =
k
a}
For a left-handed system the determinant has a minus sign in front.
E X AMP L E 1
Vector Product
For the vector product v = a x b of a = [I, I, OJ and b = [3, 0, 0] in right-handed coordinates we obtain
from (2)
VI
= 0,
V3
= I ·0 - 1·3 = -3.
We confirm this by (2**):
j
k
~I k
v=axb=
3
o
= -3k = [0,0, -3].
o
To check the result in this simple case, sketch a, b, and v. Can you see that two vectors in the .xy-plane must
always have their vector product parallel to the z-axis (or equal to the zero vector)?
•
SEC 9.3
379
Vector Product (Cross Product)
E X AMP l E 2
Vector Products of the Standard Basis Vectors
i x j
=
k,
jxk=i,
k x i = j
k x j = -i,
i x k = -j.
(3)
j x i = - k,
•
We shall use this in the next proof.
THEOREM 1
General Properties of Vector Products
(a) For every scalar I,
(4)
(fa) x b
= lea x b) = a x (fb).
(b) Cross multiplication is distributive with respect to vector addition; that is.
x b)
+
(a x c),
b) x c = (a x c)
+
(b x c).
(0')
a x (b
({3)
(a
(5)
+
+ c) = (a
(c) Cross multiplication is not cOllllllutative bllt alltico11l11lutative; that is,
b x a
(6)
a
I
bXRf
Fig. 187.
Anticommutativity
of cross
multiplication
PROOF
=
-l3
x b)
(Fig. 187).
(d) Cross multiplication is not associative; that is, in general,
(7)
a x (b x C)
"* (a x
b) x c
so that the parentheses cannot be omitted.
(4) follows directly from the definition. In (50'), formula (2*) gives for the first component
on the left
I=
a3
b3
+
a2(b 3
+ C3) - a3(b 2 + C2)
C3
By (2*) the sum of the two determinants is the fIrst component of (a x b) + (a x C), the
right side of (Sa). For the other components in (50') and in (5{3), equality follows by the
same idea.
Anticommutativity (6) follows from (2**) by noting that the interchange of Rows 2
and 3 multiplies the determinant by -I. We can confirm this geometrically if we set
a x b = v and b x a = w; then Ivl = Iwl by (1), and for b, a, w to form a right-handed
triple, we must have w = -v.
Finally. i X (i x j) = i x k = -j, whereas (i x i) x j = 0 X j = 0 (see Example
..
2). This proves (7).
380
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Typical Applications of Vector Products
E X AMP L E 3
Moment of a Force
In mechanics the moment III of a force p about a point Q is defined as the product III = Ipld, where d is the
(perpendicular) distance between Q and the line of action L of p (Fig. 188). If r is the vector from Q to any
point A on L, then d = 11'1 sin l' (Fig. 188) and
111
Since 1'is the angle between
= Irllpl sin 1'.
and p, we see from (I) that
I'
m =
(8)
I' X
III
= Ir x pI. The vector
P
*
is called the moment \'ector or "ector moment of p about Q. Its magnitude is 111. If m
0, its direction is
that of the axis of the rotation about Q that p has the tendency to produce. This axis i~ perpendicular to both
I' and p.
•
L
Qy-_ _
p~
l'
y
\
d \
\
Moment of a force p
Fig. 188.
E X AMP L E 4
Moment of a Force
Find the moment of the force p in Fig. 189 about the center Q of the wheel.
Solutioll.
Introducing coordinates as shown in Fig. 18<), we have
p = [1000 cos 30°,
1000 sin 30°,
0] = [866,
500,
r = [0,
0],
1.5,
0].
(Note that the center of the wheel is at y = -1.5 on the y-axis.) Hence (8) and (2**) give
j
m=rxp=
k
0
1.5
0 = Oi - OJ +
866
SOU
0
18~6
5
1. k = [0,0, -1299].
500
1
This moment vector is normal (perpendicular) to the plane of the wheel; hence it has the direction of the axis
of rotation about the center of the wheel that the fon;e has the tendency to produce. m points in the negative
;:-direction, the direction in which a right-handed screw would advance if turned in that way.
•
y
Ipl
= 1000 Ib
x
Q
Fig. 189.
Moment of a force p
SEC. 9.3
Vector Product (Cross Product)
E X AMP L E 5
381
Velocity of a Rotating Body
A rotation of a rigid body B In space can be simply and uniquely described by a vector w as follows. The
direction of w i, that of the axis of rotation and such that the rotation appears clockwise if one looks from the
initial point of w to its terminal point. The length of w is equal to the angular speed w (> 0) of the rotation.
thai is. the linear (or tangential) speed of a point of B divided by its dislance from the axis of rotation.
Let P be any point of Band d its distance from the axis. Then P has the speed wd. Let r be the position
vector of P referred to a coordinate system with origin 0 on the axis of rotation. Then d = Irl sin 'Y. where 'Y is
the angle between wand r. Therefore.
wd = Iwllrl sin 'Y = Iw x rl.
From this and the definition of vector product we see that the velocity vector v of P can be represented in the
form (Fig. 190)
v = w x r.
(9)
•
This simple formula is useful for determining v at any point of B.
d
Fig. 190.
Rotation of a rigid body
Scalar Triple Product
The most important product of vectors with more than two factors is the scalar triple
product or mixed triple product of three vectors a. b, c. It is denoted by (a b c) and
defined by
\a
(10*)
b
c)
= ao(b x c).
Because of the dot product it is a scalar. In terms of components a = [at. {/2, l/3].
b = [b 1 • b 2 , b3 ]. C = [Cb c2, C3] we can write it as a third-order detenninant. For this we
set b x c = v = [VI. V 2 , V3]' Then from the dot product in components [formula (2) in
Sec. 9.2] and from (2*) with band c instead of a and b we first obtain
The sum on the right is the expansion of a third-order determinant by its first row. Thus
(10)
(a
b
c)
= ao(b x c) =
b1
CHAP. 9
382
Vector Differential Calculus. Grad, Div, Curl
The most important properties of the scalar triple product are as follows.
THEOREM 2
Properties and Applications of Scalar Triple Products
(a) In (10) the dot and cross can be interchanged:
(a
(11)
b
= ao(b x c) = (a x b)oc.
c)
(b) Geometric interpretation. The absolute value I(a b c)1 of (10) is the
volume of the parallelepiped (oblique box) with a, b. c as edge vectors (Fig. 191).
(c) Linear independence. Three vectors in R3 are linearly independent
only ~f their scalar triple product is not zero.
PROOF
if and
(a) Dot multiplication is commutative. so that by (10)
(a x b)oc = coCa x b) =
al
a2
a3
b1
b2
b3
From this we obtain the determinant in (10) by interchanging Rows 1 and 2 and in the
result Rows 2 and 3. But this does not change the value of the determinant because each
interchange produces a factor - I, and (- 1)( -}) = 1. This proves (11).
(b) The volume of that box equals the height h = lallcos 'YI (Fig. 191) times the area
of the base, which is the area Ib x cl of the parallelogram with sides b and c. Hence the
volume is
lallb
cllcos 'YI
X
= lao (b x c)1
(Fig. 191)
as given by the absolute value of (II).
(c) Three nonzero vectors, if we let their initial points coincide, are linearly independent
if and only if they do not lie in the same plane (or do not lie on the same straight line).
This happens if and only if the triple product in (b) is not zero, so that the independence
criterion follows. (The case that one of the vectors is the zero vector is trivial.)
•
I
I
bxc
I
I
1
I
:h/~-----/
/1
// 1
Fig. 191.
E X AMP L E 6
b
Geometric interpretation of a scalar triple product
Tetrahedron
A tetrahedron is determined by three edge vectors a, b, c, as indicated in Fig. 192. Find its volume when
a
=
[2. O. 3]. b
Soilltioll.
=
[0.4. I]. c
=
[5.6. OJ.
The volume Vof the parallelepiped with these vectors as edge vectors is the absolute value of the
scular triple product
SEC. 9.3
383
Vector Product (Cross Product)
(a
b
b
c) =
2
o
0
4
5
6
3
:1 =
-12 - 60
=
-72.
o
Hence V = 72. The minus sign indicates that if the coordinates are right-handed, the triple a, b, c is left-handed.
The volume of a tetrahedron is ~ of that of the parallelepiped (can you prove it?). hence 12.
Can you sketch the tetrahedron, choosing the origin as the common initial point of the vectors? What are the
coordinates of the four vertices?
•
Fig. 192.
Tetrahedron
This is the end of vector algebra (in space R3 and in the plane). Vector calculus
(differentiation) begins in the next section.
[ 1-20
VECTOR PRODUCT, SCALAR TRIPLE
PRODUCT
1
With respect to right-handed Cartesian coordinates, let
a = [1. 2. 0]. b = [3. -4,0], c = [3.5.2]. d = [6,2, -3].
Showing details. find:
1. a x b, b x a
2. a x c, la xci, aoc
3. (a + b) x c, a x c + b x c
4. (c + d) x d, c x d
5. 2a x 3b, 3a x 2b, 6a x b
6.bxc+cxb
7. ao(b x c), (a x bloc
8. (a
9. (a
10. (a
+
x
x
b) x (b
18. (a + b
19. (a - c
20. (4a
3b
(12)
la x bl = Y(aoa)(bob) - (a ob)2
(13)
b x (c x d) = (bod)c - (boc)d
(14)
(a x b) x (c x d)
=
(15)
(a
b
d)c - (a
b
c)d
(a x b)o(c x d) = (aoc)(bod) - (aod)(boc)
(a
b
c) = (b
(16)
= -(c
b) x c, a x (b x c)
c
a)
b
= (c
a) = -(a
a
b)
c
b)
11. d x c, Id x cI. IC x dl
12. (a + b) x (c + d)
13. a x (b + c - dl
[25-281 MOMENT OF A FORCE
Find the moment vector m and the moment 111 of a force p
about a point Q when p ads on a line through A.
14. (i
25. P = L4, 4, 0], Q: (2, 1,0), A: (0.3,0)
j
k), (i
k
j)
..:1'\1
U).ltii
U
Formula (15) is called Lagrange's identity.
+ a)
b)o(c x d). (b x a)o(d x c)
15. (i + j j + k k + i)
16. (b x Clod, bo(c x d)
.,. til
each side of (13) then equals [-b2C2dl' b1C2dl' 0]. and
give reasons why the two sides are then equal in any
Cartesian coordinate system. For (14) and (15) use (13).
27. P = [1,2,3]. Q: (0, I, 1), A: (1. 0, 3)
fL
~
.n
b + c c + d)
b - c c). (a b
lc). 24(b
26. P = [0. O. 5]. Q: (3. 3. 0), A: (0, O. 0)
c
c)
a)
21. What properties of cross multiplication do Probs. I, 3,
8, 10 illustrate?
22. Give the details of the proofs of (4) and (5).
23. Give the details of the proofs of (6) and (11).
24. TEAM PROJECT. Useful Fonnulas for Two and
More Vectors. Prove (12)-06). which are often useful
in practical work. and illustrate each formula with two
examples. Hillts. For (13) choose Cartesian coordinates
such that d = [c11 , 0, 0] and c = [Cl, C2. 0]. Show that
28. p = [4. 12.8]. Q: (3.0,5). A: (4. 3. 7)
29. (Rotation) A wheel is rotating about the y-axis with
angular speed w = 10 sec-I. The rotation appears
clockwise if one looks from the origin in the positive
y-direction. Find the velocity and speed at the point
(4, 3, 0).
30. (Rotation) What are the velocity and speed in Prob.
29 at the point (4. 2. -2) if the wheel rotates about the
line y = x, Z = 0 with w = 5 sec-I.
GEOMETRIC APPLICATIONS
31. (Parallelogram) Find the area if the vertices are (2, 2),
(9. 2), (10, 3), (3, 3).
CHAP. 9
384
Vector Differential Calculus. Grad, Div, Curl
32. (Parallelogram) Find the area if the vertices are (3, 9, 8),
(0, 5, 1), (-1, -3, -3), (2, 1, 4).
33. (Triangle) Find the area if the vertices are (1, 0, 0).
(0. 1. 0), (0. O. 1).
34. (Triangle) Find the area if the vertices are (4. 6. 5).
(4. 9, 5), (8.6. 7).
35. (Plane) Find a nonnal vector and a representation of the
plane through the points (4, 8, 0), (0, 2, 6), (3, 0, 5).
36. (Plane) Find the plane through (2, I, 3), (4, 4. 5),
(I, 6, 0).
9.4
37. (Parallelepiped) Find the volume of the parallelepiped
detennined by the vertices (1, 1, 1), (4, 7, 2). (3, 2, 1),
(5, 4, 3).
38. (Tetrahedron) Find the volume of the tetrahedron with
vertices lO, 2, 1), (4, 3, 0), (6, 6, 5), (4, 7, 8).
39. (Linear dependence) For what c are the vectors [9, 1, 2J.
[-I, c, 5]. [4, c. 5] linearly dependent?
40. WRITING PROJECT. Applications of Cross
Products. Summarize the most important applications
we have discussed in this section and give a few simple
examples. No proofs.
Vector and Scalar Functions and Fields.
Derivatives
We now begin with vector calculus. This calculus concerns two kinds offunctions, namely,
vector functions, whose values are vectors
depending on the points P in space, and scalar functions, whose values are scalars
f
=
f(P)
depending on P. Here, P is a point in the domain of definition, which in applications is
a (three-dimensional) domain or a surface or a curve in space. We say that a vector function
defines a vector field, and a scalar function defines a scalar field in that domain or on
that surface or curve. Examples of vector functions are shown in Figs. 193-196. Examples
of scalar fields are the temperature field in a body or the pressure field of the air in the
earth's atmosphere. Vector and scalar functions may also depend on time t or on some
other parameters.
Notation.
write
If we introduce Cartesian coordinates x, y. z, then instead of v(P) we can also
"Fig. 193. Field of tangent
vectors of a curve
Fig. 194.
Field of normal
vectors of a surface
SEC. 9.4
Vector and Scalar Functions and Fields. Derivatives
385
but we keep in mind that components depend on the choice of a coordinate system, whereas
a vector field that has a physical or a geometric meaning should have magnitude and
direction depending only on P, not on that choice. Similarly for the value of a scalar field
f(P) = f(x, y, z).
E X AMP L E 1
Scalar Function (Euclidean Distance in Space)
The distance f(P) of any point P from a fixed point Po in space is a scalar function whose domain of definition
is the whole space. f(P) defines a scalar field in space. If we introduce a Cartesian coordinate system and Po
has the coordinates xo, Yo, Zo, then f is given by the well-known formula
f(P)
=
f(x, y, z)
=
Vex -
.\'0)2
+
(y - YO)2
+ (z -
-::.0)2
where x, y, z are the coordinates of P. If we replace the given Cartesian coordinate system with another such
system by translating and rotating the given system, then the values of the coordinates of P and Po will in general
change, but J(P) will have the same value as before. Hence f(P) is a scalar function. The direction cosines of
the straight line through P and Po are not scalars because their values depend on the choice of the coordinate
system.
•
E X AMP L E 2
Vector Field (Velocity Field)
At any instant the velocity vectors v(P) of a rotating body B constitute a vector field, called the velocity field
of the rotation. If we introduce a Cartesian coordinate system having the origin on the axis of rotation, then (see
Example 5 in Sec. 9.3)
(1)
vex, y, z) =
w
xr =
w
X
[x, y, zl = w
x (xi + yj + zk)
where x, y, z are the coordinates of any point P of B at the instant under consideration. If the coordinates are
such that the z-axis is the axis of rotation and w points in the positive z-direction, then w = wk and
k
v =
0
o
w
x
y
z
= w[ -y,
x, 0] = w(-yi
+ xj).
An example of a rotating body and the corresponding velocity field are shown in Fig. 195.
•
I
I
I
I
I
--~-=~-1~
K
=-=~~
I
I
c0
Fig. 195.
E X AMP L E 3
Velocity field of a rotating body
Vector Field (Field of Force, Gravitational Field)
Let a particle A of mass M be fixed at a point Po and let a particle B of mass m be free to take up various
positions P in space. Then A attracts B. According to Newton's law of gravitation the corresponding gravitational
force p is directed from P to Po, and its magnitude is proportional to IIr 2, where r is the distance between
P and Po, say,
(2)
c= GMIIl.
386
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Here G = 6.67' 10-8 cm3 /(gm· sec2 ) is the gravitational constant. Hence p defines a vector field in space. If
we introduce Cartesian coordinates such that Po has the coordinates xo. Yo. Zo and P has the coordinate~ x. y. z.
then by the Pythagorean theorem.
(~
Assuming that r
0).
> 0 and introducing the vector
r
= [x -
= Ix -
xo, Y - .1'0' :: - ::01
xo)i
+
{y - yo)j
+ {z -
:::o)k.
we have Irl = ,.. and {- IIrjr is a unit vector in the direction of p; the minus sign indicates that p is directed
from P to Po (Fig. 196). From this and (2) we obtain
p = Ipl
( I)
- - r
,.
= - -
c r
= -c
--3-) -
r3
=
[ -c
x-xo
--
y - Yo
,.
-c
r3
--3-'
(3)
x - xo .
r
c
y - Yo .
c
--3-J -
r
::: r
:::0
--3-
b..
•
This vector function describes the gravitational force acting on B
~p
t
---
00.....-.
t
Fig. 196.
Gravitational field in Example 3
Vector Calculus
We show next that the basic concepts of calculus, namely. convergence. continuity. and
differentiability, can be defined for vector functions in a simple and natural way. Most
imp0l1ant here is the derivative.
Convergence. An infinite sequence of vectors
if there is a vector a such that
(4)
lim la(n) - al
n_x
3(n)'
n
= L 2..... is said to converge
= O.
a is called the limit vector of that sequence. and we write
(5)
lim
a(n)
=
a.
n~oo
Cartesian coordinates being given, this sequence of vectors converges to a if and only
if the three sequences of components of the vectors converge to the corresponding
components of a. We leave the simple proof to the student.
SEC. 9.4
387
Vector and Scalar Functions and Fields. Derivatives
Similarly, a vector function v(t) of a real variable t is said to have the limit 1 as t
approaches to, if vet) is defined in some neighborhood of to (possibly except at to) and
(6)
o.
11 =
lim Iv(t) -
t--.+to
Then we write
lim v(t)
(7)
t---+to
=L
Here, a neighborhood of to is an interval (segment) on the t-axis containing to as an interior
point (not as an endpoint).
Continuity. A vector function v(t) is said to be continuous at t
some neighborhood of to (including at to itself!) and
(8)
lim vet)
t---+to
= to if it is defined in
= veto).
If we introduce a Cartesian coordinate system, we may write
Then v(t) is continuous at to if and only if its three components are continuous at to.
We now state the most important of these definitions.
DEFINITION
Derivative of a Vector Function
A vector function v(t) is said to be differentiable at a point t if the following limit
exists:
(9)
,
.
v (t) = hm
At~O
v(t + t1t) - v(t)
A
ut
This vector v'(t) is called the derivative of v(t). See Fig. 197.
Fig. 197.
Derivative of a vector function
In components with respect to a given Cartesian coordinate system.
(10)
v' (t) = [v~(t),
v~(t),
v~(t)].
Hence the derivative v' (t) is obtained by differentiating each component separately. For
instance, if v = [t, t 2 , 0], then v' = [1, 2t, 0].
388
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Equation (10) follows from (9) and conversely because (9) is a "vector form" of the
usual formula of calculus by which the derivative of a function of a single variable is
defined. [The curve in Fig. 197 is the locus of the terminal points representing v(t) for
values of the independent variable in some interval containing 1 and 1 + At in (9)]. It
follows that the familiar differentiation rules continue to hold for differentiating vector
functions, for instance,
=
(cv)'
(0
+ v)' =
cv'
(c
+
0'
constant).
v'
and in particular
(O'v)' = u'·v
(11)
(12)
(13)
(0 X
(0
v
w)' = (0'
+
v)' = u' x v
v
w)
+
(0
u'v'
+0
v'
X
v'
w)
+ (0
v
w').
The simple proofs are left to the student. In (12), note the order of the vectors carefully
because cross multiplication is not commutative.
E X AMP L E 4
Derivative of a Vector Function of Constant Length
Let vet) be a vector function whose length is constant. say, Iv(t)1 = c. Then Ivl 2 = v·v = c 2 , and
(v· v)' = 2v· v' = 0, by differentiation [see (11)]. This yields the following result. The derivative of a vector
•
function vet) of constant length is either the zero vector or is perpendicular to vet).
Partial Derivatives of a Vector Function
Our present discussion shows that partial differentiation of vector functions of two or more
variables can be introduced as follows. Suppose that the components of a vector function
are differentiable functions of n variables tlo . . . , tn' Then the partial derivative of v
with respect to 1m is denoted by av/atm and is defined as the vector function
av
Similarly. second partial derivatives are
and so on.
E X AMP L E 5
Partial Derivatives
ilr
ilt1
=
-a sin t1 i
+ a cos t1 j
and
ilr
-=k.
ilt2
•
Various physical and geometric applications of derivatives of vector functions will be
discussed in the next sections and in Chap. 10.
SEC. 9.5
~1~
SCALAR FIELDS
7. (Isobars) For the pressure field f(x. y) = 9x2 + 16y2
find the isobars f(x, y) = const, the pressure at (4, 3),
(- 2, 2), (I, 5), and the regIOn in which the pressure is
between 4 and 16.
S. CAS PROJECT. Scalar Fields in the Plane. Sketch
or graph isotherms of the following fields and describe
what they look like.
(a) x 2 -
4x -
(b) x 2y - y3/3
y2
(c) cos x sinh)
(e) eX sin y
(d) sin x sinh y
(f) e 2x cos 2)'
(g) x4 _ 6x2y2
(h) x 2 -
+
)'4
f
= z; -
13. f
= 4x
11.
Determine the isotherms (curves of constant temperature
T) of the temperature fields in the plane given by the
following scalar functions. Sketch some isotherms.
1. T = xy
2. T = 4x - 3)
3. T = y2 - x 2
4. T = x/(x 2 + ."2)
2
5. T = y/(x + y2)
6. T = x 2 - y2 + 8y
115-201
Vx 2 + y2
+
12.
f
+
x 2j
= yi - xj
=
Z
VECTOR FIELDS
Sketch figures similar to Fig. 196.
15. v = i - j
16. v
17. v
19. v
= 4.\'2 -
3)' - 5z
i
IS. v
=
yi
=
xi
+ xj
+ yj
20. v = (x - y)i + (x + v)j
121-251
DIFFERENTIATION
21. Prove (11)-(13). Give two examples for each formula.
22. Find the first and second derivatives of
[4 cos t, 4 sin t, 2tl
23. Find the first partial derivatives of [4x 2, 9z 2, xyz] and
[yz, zx, .I.}'].
2x _ y2
I
9-1.::'
SCALAR FIELDS IN SPACE
What kind of surfaces are the level surfaces f(x,)" z) = cOllst?
9. f = x 2 + )"2 + 4~2
10. f = x 2 + 4y2
9.5
389
Curves. Arc Length. Curvature. Torsion
24. Find the first partial derivatives of
[sin x cosh y, cos x sinh yJ and [eX cos)" eX sin y].
25. WRITING PROJECT. Differentiation of Vector
Functions. Summarize the essential ideas and facts and
gi ve examples of your own.
Curves. Arc Length. Curvature. Torsion
A major application of vector calculus concerns curves (this section) and surfaces (Sec.
10.5) and their use in physics and geometry. This field is called differential geometry.
It plays a role in mechanics, computer-aided and traditional engineering design, geodesy
and geography, space travel, and relativity theory (see Refs. [GR8], [GR9] in App. I).
Curves C in space may occur as paths of moving bodies. This and other applications
motivate parametric representations with parameter t, which may be time or something
else (see Fig. 198)
(1)
r(t) = [x(t),
Fig. 198.
y(t),
z(t)] = x(t)i
+
y(t)j
+ z(t)k.
Parametric representation of a curve
390
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Here x, y, z are Cartesian coordinates (the usual rectangular coordinates; see Sec. 9.1).
To each value t = to there corresponds a point of C with position vector r(to), that is,
with coordinates x(to), y(to). :(to).
Parametric representations (1) have a key advantage over representations of a curve C
in terms of its projections into the x:v-plane and into the xz-plane, that is,
y = f(x).
(2)
z
=
g(x)
(or by a pair of equations with y or with z as the independent variable). The advantage is
that in (I) the coordinates x, y, : play the same role: all three are dependent variables.
Moreover, the sense of increasing t, called the positive sense on C, induces an orientation
of C, a direction of travel along C. The sense of decreasing t is then called the negative
sense on C, given by (I).
EXAMPLE 1
Circle
The circle.£2 + ... 2 = 4, ;: = 0 in the :\~\'-plane with center 0 and radius 2 can be represented parametrically by
r(t) = [2 cos t. 2 sin t, 0]
or simply by
r(t) = [2 cos t. 2 sin t]
(Fig. 199)
where 0 ;;; t ;;; 21T. Indeed. x 2 + y2 = (2 co, 1)2 + (2 sin t)2 = 4(cos2 t + sin 2 t) = 4. For t = 0 we have
r(O) = [2, 0], for t = ~1T we get r(~1T) = [0. 2]. and so on. The positive sense induced by this representation
is the counterclockwise sense.
if we replace t with t* =
t, we have t = -t* and get
r*(t*) = [2 cos (-t*). 2 sin (-t*)) = [2 cos t*, -2 sin t*].
•
This has reversed the orientation. and the circle is now oriented clockwise.
E X AMP L E 2
Ellipse
The vector function
r(t) = [a cos t.
(3)
bsint,
0] = acost i + bsint j
(Fig. 200)
represents an ellipse in the \}'-plane with center at the origin and principal axes in the direction of the x and y
axes. In fact, since cos2 t + sin 2 t = 1, we obtain from (3)
z = O.
If b
=
•
a, then (3) represents a circle of radius a.
"~\~~
-~y\
(t=
Fig. 199.
~1I)T
Circle in Example 1
(t
Fig. 200.
=
~1I)1
(t
= 0)
Ellipse in Example 2
SEC. 9.5
391
Curves. Arc Length. Curvature. Torsion
E X AMP L E 3
Straight Line
A straight line L through a point A with position vector a in the direction of a constant vector b (,ee Fig. 201)
can be represented parametrically in the form
(4)
If b is a unit vector. its components are the direction cosines of L. In this case. It I mea,ures the distance of the
points of L from A. For instance. the straight line in the xv-plane through A: (3, 2) having slope l is (sketch it)
r(t)
~
[3,
2,
+
0]
1,
t[l,
0]
~
[3
+ t, 2 +
t,
•
0].
A
z
/
------
y
X
Fig. 201.
a
Parametric representation of a straight line
A plane curve is a curve that lies in a plane in space. A curve that is not plane is called
a twisted curve. A standard example of a twisted curve is the following.
E X AMP L E 4
Circular Helix
The twisted curve C represented by the vector function
(5)
r(t)
~
[a cos t.
a sin t.
etl
~
a cos t i
+
a sin t j
+
et k
(c'*
0)
is called a circlilar helix. It lies on the cylinder x 2 + y2 = a 2 . If c > O. the helix is shaped like a right-handed
screw (Fig. 202). If c < 0, it looks like a left-handed screw (Fig. 203). If c = 0, then (5) is a circle.
•
y
I
I
I
/
.P--
../
/
y
x
Fig. 201.
Right-handed circular helix
Fig. 203.
Left-handed Circular helix
A simple curve is a curve without multiple points, that is, without points at which the
curve intersects or touches itself. Circle and helix are simple. Figure 204 shows curves
that are not simple. An example is [sin 2t, cos t, 0]. Can you sketch it?
An arc of a curve is the portion between any two points of the curve. For simplicity,
we say "curve" for curves as well as for arcs.
392
CHAP. 9
Vector Differential Calculus. Grad. Div. Curl
Fig. 204.
Curves with multiple points
Tangent to a Curve
The next idea is the approximation of a curve by straight lines, leading to tangents and
to a definition of length. Tangents are straight lines touching a curve. The tangent to a
simple curve C at a point P of C is the limiting position of a straight line L through P
and a point Q of C as Q approaches P along C. See Fig. 205.
If C is given by ret), and P and Q cOlTespond to T and t + b..t, then a vector in the
direction of L is
I
(6)
[ret
.:1t
+
Ilt) - r(t)].
In the limit this vector becomes the derivative
,
r (t)
(7)
= lim
:,t~O
I
A
ul
Ir(t
+
b..t) - r(t)l,
provided r(t) is differentiable, as we shall assume from now on. If r' (t) =F 0, we call r' (t)
a tangent vector of C at P because it has the direction of the tangent. The cOlTesponding
unit vector is the unit tangent vector (see Fig. 205)
(8)
u=
1
,
!r'! r.
Note that both r' and u point in the direction of increasing t. Hence their sense depends
on the orientation of C. It is reversed if we reverse the orientation.
It is now easy to see that the tangent to C at P is given by
(9)
q(w)
= r + wr'
(Fig. 206).
This is the sum of the position vector r of P and a multiple of the tangent vector r' of C
at P. Both vectors depend on P. The variable w is the parameter in (9).
L
o
Fig. 205.
Tangent to a curve
Fig. 206.
Formula (9) for the tangent to a curve
SEC. 9.5
393
Curves. Arc Length. Curvature. Torsion
E X AMP L E 5
Tangent to an Ellipse
Find the tangent to the ellipse ~x2 + y2
=
1 at P:
CV2, 11V2).
Solution.
r' (t)
=
Equation (3) with semi-axes a = 2 and b = 1 gives r(t) = [2 cos t, sin t]. The derivative
[-2 sin t. cos t]. Now P corresponds to t = 7T/4 because
r(7T/4)
Hence r' (7T/4) = [ - V2,
q(w) =
=
= [V2.
[2 cos (7T/4). sin (7714)]
IS
11V2].
I/V2]. From (9) we thus get the answer
[V2, 11V2] + 1\'[-V2, 11V2]
=
[V2(1 -
11'),
(lNi)(l t- 11')].
•
To check the result, sketch or graph the ellipse and the tangent.
Length of a Curve
We are now
broken lines
let ret), a ~
interval a ~
ready to define the length I of a curve. I will be the limit of the lengths of
of n chords (see Fig. 207, where n = 5) with larger and larger n. For this,
t ~ b, represent C. For each n = I, 2, ... we subdivide ("partition") the
t ~ b by points
where
This gives a broken line of chords with endpoints r(to), ... , r(tn). We do this arbitrarily
but so that the greatest ILlt.,nl = Itm - t m-ll approaches 0 as n ~ co. The lengths
II' 12 , • • • of these chords can be obtained from the Pythagorean theorem. If ret) has a
continuous derivative r' (t), it can be shown that the sequence II' 12 , ••• has a limit, which
is independent of the particular choice of the representation of C and of the choice of
subdivisions. This limit is given by the integral
(10)
I is called the length of C, and C is called rectifiable. Formula (10) is made plausible in
calculus for plane curves and is proved for curves in space in [GR8] listed in App. 1. The
practical evaluation of the integral (10) will be difficult in general. Some simple cases are
given in the problem set.
Arc Length
5
of a Curve
The length (10) of a curve C is a constant, a positive number. But if we replace the fixed
b in (10) with a variable t, the integral becomes a function of t, denoted by s(t) and called
the arc length function or simply the arc length of C. Thus
t
(11)
s(t)
=
I Vr or
~ rtI
dt-
a
Fig. 207.
Length of a curve
~n·
394
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Here the variable of integration is denoted by t because t is now used in the upper limit.
Geometrically, s(to) with some to > a is the length of the arc of C between the points
with parametric values a and to. The choice of a (the point s = 0) is arbitrary; changing
a means changing s by a constant.
Linear Element ds.
(12)
ds
( -dt
If we differentiate
)2 = -dr • -dr
dt
dt
= Ir
U 1) and square, we have
'2
(t)1 = (-dx )2 + (-dy )2 + (-d::. )2.
dt
dt
dt
It is customary to write
dr
(13*)
[dx, dy, d;::] = dxi
=
+
dyj
+
d;::k
and
(13)
ds is called the linear element of C.
Arc Length as Parameter. The use of sin (1) instead of an arbitrary t simplifies various
formulas. For the unit tangent vector (8) we simply obtain
(14)
U(s)
= r' (s).
Indeed, Ir' (s)1 = (ds/ds) = I in (12) shows that r' (s) is a unit vector. Even greater
simplifications due to the use of s will occur in curvature and torsion (below).
E X AMP L E 6
Circular Helix. Circle. Arc Length as Parameter
The helix r(l) = [a cos I. a sin I. el] in (5) has the derivative r' (I) = [-a sin t. a cos t. d. Hence r' • r' = a 2 + e 2•
a constant. which we denote by K2. Hence the integrand in ( II) is constant. equal to K. and the integral is s = Kt.
Thus I = 51K. so that a representation of the helix with the arc length s as parameter is
(15)
r*(s)
=
s )
r( K
= [
a cos
K5
asin
5
cSJ
K , K'
K =
Va
2
+ c2 .
A circle is obtained if we seI c = O. Then K = a. t = sla. and a representation with arc length s as parameter is
r*(s)
=
r( ~)
=
[a cos
~
a sin
~
J.
•
Curves in Mechanics. Velocity. Acceleration
Curves playa basic role in mechanics, where they may serve as paths of moving bodies.
Then such a curve C should be represented by a parametric representation rV) with time
t as parameter. The tangent vector (7) of C is then called the velocity vector v because,
being tangent, it points in the instantaneous direction of motion and its length oives the
speed Ivl
= Ir'l =
~ = dsldt; see (2). The second derivative of r(t)~is c':uled the
SEC. 9.5
395
Curves. Arc Length. Curvature. Torsion
acceleration vector and is denoted by a. Its length lal i<; called the acceleration of the
motion. Thus
(16)
v(t)
=
r' (t),
aCt)
= v' (t) = r"(t).
Tangential and Normal Acceleration. Whereas the velocity vector is always tangent
to the path of motion, the acceleration vector will generally have another direction, so that
it will be of the form
(17)
where the tangential acceleration vector atan is tangent to the path (or, sometimes, 0)
and the normal acceleration vector a norm is normal (perpendicular) to the path (or,
sometimes, 0).
Expressions for the vectors in (17) are obtained from (16) by the chain rule. We first
have
dr
dr ds
dt
ds
vet) = -
-
dt
=
ds
u(s)-
dt
where u(s) is the unit tangent vector (4). Another differentiation gives
2
(18)
aCt)
= -dv = -d ( U(s) -ds ) = -du (ds)2
+ u(s) -d 2s
dt
dt
dt
ds
dt
dt
Since the tangent vector u(s) has constant length (one), its derivative du/ds is perpendicular
to u(s) (by Example 4 in Sec. 9.4). Hence the first term on the right of (18) is the normal
acceleration vector, and the second term on the right is the tangential acceleration vector,
so that (18) is of the form (17).
Now the length of a tan is the projection of a in the direction of v, given by (II) in
Sec. 9.2 with b = v; that is, latanl = a·v/lvl. Hence atan is this expression times the unit
vector (1IIvl)v in the direction of v; that is,
a·v
atan = - - v.
(18*)
Also.
v·V
a norm = a - a tan .
Let us consider two basic examples, involving centripetal and centrifugal accelerations
and Corio lis acceleration, as it occurs. for instance. in space travel.
E X AMP L E 7
Centripetal Acceleration. Centrifugal Force
The vector function
ret)
= [R cos wt.
R sin wtj
= R cos wt i +
R sin wt j
(Fig. 208)
(with fixed i and j) represents a circle C of radIUS R with center at the origm of the .\)"-plane and describes the
motion of a small body B counterclockwise around the circle. Differenriarion gives the velocity vector
v = r' = [- Rw sin wt.
v is tangent to C.
It~
magnitude, the speed.
Rw cos wt] = - Rw sin wt i
i~
Ivl = Ir'l = w-:-;:' = Rw.
T
Rw cm, wt .i
(Fig. 208).
396
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
y
x
Fig. 208.
Centripetal acceleration a
Hence it is constant. The speed divided by the distance R from the center is called the angular speed. It equals
w, so that it is constant. too. Differentiating the velocity vector, we obtatn the acceleration vector
(19)
a = v' = [-Rw2 cos wt,
-Rw2 sin wt] = -Rw2 cos wt i - Rw2 sin wt j.
Thi~ shows that a = _w2 r (Fig. 208). so that there is an acceleration IOwatd the center. called the centripetal
acceleration of the motion. It occurs because the velocity vector is changing direction at a constant rate. Its
magnitude is constant, lal = w2 1rl = w2 R. Multiplying a by the mass m of B, we get the centripetal force ma.
The oppo,ite vector -ilia is called the centrifugal force. At each instant these two forces are in equilibrium.
We see that in this motion the acceleration vector is normal (perpendicular) to C; hence there is no tangential
acceleration.
•
E X AMP L E 8
Superposition of Rotations. Coriolis Acceleration
A projectile IS moving with constant speed along a meridian of the rotating eatth in Fig. 209. Find its acceleration.
a
~
~"\
-----p
---
,
,
--\.
I',
1
J
1
Fig. 209.
Example 8. Superposition of two rotations
Solution.
Let x. y, :: be a tixed Cartesian coordinate system in space. with unit vectors i, j, k in the directiuns
of the axes. Let the earth, together with a unit vector b, be rotating about the z-axis with angulat· speed w > 0
(see Example 7). Since b is rotaing together with the earth. it is of the form
b(t) = cos wt i
+ sin WI j.
Let the projectile be moving on the meridian whose plane is spanned by band k (Fig. 209) with constant angular
speed y > O. Then its position vector in terms of band k IS
r(l) = R cos yt btl) + R sin yl k
(R = Radius of the earth).
SEC. 9.5
Curves. Arc Length. Curvature. Torsion
397
This is the modeL The rest is calculation. The result will be unexpected and highly relevant for air and space
travel. The first and second derivatives of b with respect to 1 are
b ' (tl = -w sin wt i + w cos wt j
(20)
The first and second derivatives of rtt) with re<;pect to tare
v = r'(t) = R cos ')'t b ' - ')'R sin
(21)
')'t
b + ')'R cos ')'t k
a = v' = R cos ')'t b" - 2')'R sin ')'t b ' - ')'2R cos ')'t b - ')'2R sin ')'t k
=
R cos
')'t
bIt - 2')'R
SIn ')'t
bI
-
')'2r.
By analogy with Example 7 and because of bIt = - w2 b in (20) we conclude that the first term in a (involving
win bIt!) is the centripetal acceleration due to the rotation of the earth. Similarly, the third term in the last line
(involving ')'!) i, the centripetal acceleratiun due to the motion of the projectile un the meridian M of the rotating
earth.
The second, unexpected term -2')'R sin ')'t b ' in a is called the Coriolis acceleration3 (Fig. 209) and is due
to the interaction of the two rotations. On the Northern Hemisphere, sin ')'t > 0 (for 1 > 0; also ')' > 0 by
assumption), so that a cor has the direction of - b I. that is, opposite to the rotation ufthe earth. lacorl is maximum
at the North Pole and zero at the equator. The projectile B of mass 1Il0 experiences a force -Illoa cor opposite
to 1I10 a co l"' which tends to let B deviate from M to the right (and in the Southern Hemisphere, where sin ')'1 <
O. to the left). This deviation has been observed for missiles. rockets. shells. and atmospheric air flow.
•
Curvature and Torsion. Optional
This optional portion of the section completes our discussion of curves from the viewpoint
of vector calculus.
The curvature K(S) of a curve C: rts) (s the arc length) at a point P of C measures the
rate of change lu' (s)1 of the unit tangent vector u(s) at P. Hence K(S) measures the deviation
of C at P from a straight line (its tangent at P). Since u(s) = r' (s). the definition is
(22)
K(S)
(' = d/ds).
= lu' (s)1 = Ir"(s)1
The torsion res) of C at P measures the rate of change of the osculating plane 0 (the
plane spanned by u and u'. see Fig. 210) of C at P. Hence res) measures the deviation
rn
E
5c
co
Rectifying plane
b
Normal plane
PrinCipal
-
p _n()fl7Jal
Osculating plane
Fig. 210.
Trihedron. Unit vectors u, p, b and planes
3GUSTAVE GASPARD CORIOLIS (1792-1843), French engineer who did research in mechanics.
CHAP. 9
398
Vector Differential Calculus. Grad, Div, Curl
of C at P from a plane (from 0 at Pl. Now the rate of change is also measured by the
derivative b' of a normal vector bat 0. By the definition of vector product, a unit normal
vector of 0 is b = u X (I/K)U' = U x p, where p = (IIK)U' is called the unit principal
normal vector and b is called the unit binormal vector of C at P; see Fig. 210. Here we
must assume that K =1= 0; hence K > O. The absolute value of the torsion is now defined by
(23*)
Whereas K(S) is nonnegative. It IS practical to give the torsion a sign. motivated by
"right-handed" and "left-handed" (see Figs. 202. 203). This needs a little further
calculation. Since b is a unit vector, it has constant length. Hence b' is perpendicular to
b (see Example 4 in Sec. 9.4). Now b' is also perpendicular to U because by the definition
of vector product we have bou = 0, bou' = O. This implies
(bou)' = 0;
that is,
b'ou
+ bou' = b ' °U + 0 = O.
"*
Hence if b'
0 at P, it must have the direction of p or -p, so that it must be of the form
b' = -7p. Taking the dot product of this by p and using pop = I gives
(23)
7(S)
= -p(s)ob'(s).
The minus sign is chosen to make the torsion of a right-handed helix positive and that of
a left-handed helix negative (Figs. 202, 203). The orthonormal vector triple u, p, b is
called the trihedron of C. Figure 210 also shows the names of the three straight lines in
the directions of u, p, b, which are the intersections of the osculating plane, the normal
plane. and the rectifying plane.
=
11-101
PARAMETRIC REPRESENTATIONS
Find a parametric representation of the following curves.
1. Circle of radius 3, center (4, 6)
2. Straight line through (5. 1.
2)
and (11, 3. 0)
15. [\!CoSt, Vsin t, oj ("Lame
16. [cosh t, sinh t. 0]
17. [t, lit, 0]
18. [1,5 + t, -5 + lit]
ClI/Te
n
)
3. Straight line through (2, O. 4) and (- 3. O. 9)
4. Straight line y
5. Circle
y2
6. Ellipse x 2
=
2x
+ 4y +
+ y2 =
+ 3, :;;
Z2
=
5, x
=
3
I. z = y
7. Straight line through ta, b, c) and (a
8. Intersection of x
19. Show that setting t = -t* reverses the orientation of
[a cos t. a sin t. 0].
= 7x
+ y - ::: =
= 1.::: = y
+ 3, b - 2. C + 5)
+z= 3
2, 1x - 5y
20. If we set t
Explain.
=
et in Prob. 12, do we get the entire line?
21. CAS PROJECT. Curves. Graph the following more
complicated curves.
9. Circle ~x2 + y2
10. Helix x 2 + y2 = 9, ::: = 4 arctan tylx)
(a) r(t) = [2 cos t + cos 2t,
(Steiner's hypocycloid)
111-181
(b) r(t) = [cos t + k cos 2t.
k = 10,2. 1,~, O. -~, -)
What curves are represented as tollows?
11. [2 + r cos 4t. 6 + r sin 4t, 2t]
12. [4 - 2t, 8t, -3 + 5t]
13. [2 + cos 3t, - 2
14. [t, t 2 , t 3 ]
+
sin 3t, 5]
(c) r(t) = [cos t.
(d) r(t) = [cos t,
closed?
2 sin t -
sin 2t]
sin t - k sin 2t] with
sin 5t] (a Lissajolls cline)
sin kt]. For what k's will it be
(e) r(t) = [R sin wt + wRt,
R cos wt + R] (cycloid).
SEC. 9.5
Curves. Arc Length. Curvature. Torsion
122-251
TANGENT
399
23. ret)
= [5 cos t,
5 sin t,
0],
24. ret) = [3 cos t,
3 sin t,
4t],
25. r(t)
= [cosh t,
CURVES IN MECHANICS
132-341
Given a curve C: r(t), find a tangent vector r' (t), a unit
tangent vector u' (t), and the tangent of C at P. Sketch the
curve and the tangent.
22. ret) = [t, t 2, 0], P: (2,4,0)
P: (4, 3, 0)
32. r(t)
33. ret)
P: (3, 0, 87T)
P: (~, ~)
sinh t],
Velocity and Acceleration. Forces on moving objects
(cars, airplanes, etc.) require that the engineer knows
corresponding tangential and normal accelerations. Find
them, along with the velocity and speed, for the following
motions. Sketch the path.
= [4t,
= [1.
-3t,
t
2
34. ret) = [cos t,
0]
0]
,
2 sin t,
0]
LENGTH
126-281
35. (Cycloid) Given
Find the length and sketch the curve.
26. Circular helix r(t)
= [2 cos t,
2 sin t,
6t] from
(2, 0, 0) to (2, 0, 247T)
27. Catenary ret) = [t,
28. Hypocycloid ret)
=
cosh t] from t = 0 to t = 1
la cos
29. Show that (10) implies €
3
=
3
a sin t]. total length
t.
I
b
~ cir for the
a
length of a plane curve C: y = f(x), z = 0, a
30. Polar coordinates p =
give€ =
I
13
Yr + y2,
~
x
~
b.
e = arctan (ylx)
Vp2 + p'2 de, where p' =
dplde. Derive
ex
this. Use it to find the total length of the cardioid
p = a(l - cos e). Sketch this curve. Hint. Use (10)
in App. 3.l.
31. CAS PROJECT. Polar Representations. Use your
4
p = ae
p=
p
p =
--
e
2
e
e
Cissoid of Diocles
+ b
Conchoid of Nic011ledes
Hyperbolic spiral
3a sin 2e
cos3
+
(R cos wt + R)j.
This cycloid is the path of a point on the rim of a wheel
of radius R that rolls without slipping along the x-axis.
Find v and a at the maximum y-values of the curve.
36. CAS
PROJECT. Paths of Motions. Gear
transmissions and other engineering constructions
often involve complicated paths whose study is greatly
facilitated by the use of a CAS. To grasp the idea, graph
the following paths and find the velocity, the speed,
and the tangential and normal accelerations.
(a) ret) = [2 cos t + cos 2t,
(Steiner's hypocycloid)
(b) ret) = [cos t
(c) ret) = [cos t,
+ cos 2t, sin t
sin 2t,
(d) r(t) = [ct cos t,
2 sin t - sin 2t]
cos 2t]
ct sin t,
sin 2t]
ct]
(c
* 0)
38. (Earth and moon) Find the centripetal acceleration of
cos
p = ale
wRt) i
Logarithmic spiral
2a sin
cos
+
(R sin wt
Spiral of Archimedes
be
a
=
=
37. (Sun and earth) Find the acceleration of the earth
toward the sun from (19) and the fact that the earth
revolves about the sun in a nearly circular orbit with
an almost constant speed of 30 kmIsec.
CAS to graph the following famous curves and
investigate their form depending on parameters a and b.
p = ae
r{t)
e + sin3 e
sin 3e
p = 2a - - sin 2e
p = 2a cos e + b
Folium of Descartes
Maclaurin's trisectrix
Pascal's snail
the moon toward the earth, assuming that the orbit
of the moon is a circle of radius 239,000 miles
= 3.85· 108 m, and the time for one complete
revolution is 27.3 days = 2.36· L06 sec.
39. (Satellite) Find the speed of an artificial earth satellite
traveling at an altitude of 80 miles above the earth's
surface, where g = 31 ft/sec 2 . (The radius of the earth
is 3960 miles.)
40. (Satellite) A satellite moves in a circular orbit
450 miles above the earth's surface and completes
I revolution in 100 min. Find the acceleration of
gravity at the orbit from these data and from the radius
of the earth (3960 miles).
4Named after ARCHIMEDES (c. 287-212 B.C.), DESCARTES (Sec. 9.1), DlOCLES (200 B.C.),
MACLAURIN (Sec. 15.4), NICOMEDES (250? B.C.) ETIENNE PASCAL (1588-1651), father of BLAISE
PASCAL (1623-1662).
CHAP. 9
400
141-501
Vector Differential Calculus. Grad, Div, Curl
CURVATURE AND TORSION
41. Show that a circle of radius a has curvature lIa.
42. Using (22), show that if C is represented by ret) with
arbitrary t, then
(23***)
VCr' r' )(r" r") - (r' r")2
0
0
45. Show that the torsion of a plane curve (with K > 0) is
identically zero.
46. Show that if C is represented by r(t) with arbitrary
parameter t. then. assuming K > 0 as before.
0
(r'
T(t)
r"
rIll)
= ---'-----'----
(r' or' )(r" or") - (r' or")2
(22*) K(t) = - ' - - - - - - - ' - - - - - - ' - (r' r')3/2
0
43. Using (22*), show that for a curve y = i{x) in the
xy-plane.
dr , etc. ) .
( y' = -'dx
(22**)
44. Using b = u x p and (23), show that
(23**)
T(S)
=
(u
p
p') = (r'
r"
r"')/K'(K
9.6
> 0).
47. Find the torsion of C: r(t) = [t. t 2 , t 3 ] (which looks
similar to the curve in Fig. 2ID).
48. (Helix) Show that the helix [Cl cos t. CI sin t, ctl can
be represented by [a cos (sIK), a sin (sIK), cslKl,
where K = VCl 2 + c2 and .I" is the arc length. Show
that it has constant curvature K = cdK2 and torsion
T= dK2.
49. Obtain K and Tin Prob. 48 from (22*) and (23***) and
the Oliginal representation in Prob. 48 with parameter t.
50. (Frenet5 formulas) Show that
u' = KP, p' = -KU + Tb, b' = -TP.
Calculus Review:
Functions of Several Variableso
Optional
Curves required vector functions of a single variable x or s, and we now proceed to
vector functions of several variables, beginning with a review from calculus. Go on to
the next section, consulting this material only when needed. (We include this short
section to keep the book reasonably ~elf-contained. For partial derivatives see
App. A3.2.)
Chain Rules
Figure 211 shows the notations in the following basic theorem.
"
1
D
~[X(U'V).Y(u.L').z(u.V)l
B
u
Fig. 211.
Notations in Theorem 1
5JEAN-FREDERIC FRENET (l816-1900), French mathematician.
SEC. 9.6
Calculus Review: Functions of Several Variables.
Optional
401
Chain Rule
THEOREM 1
Let w = f(x, )', z) be continuous and have continuous first partial derivarives in
a domain D in xy:;:-space. Let x = x(u, v), y = y(u, v), :;: = z(u, v) be funcTions
that are colltinuous and hal'e first partial derivatives in a domain B in the
uv-plane, where B is such that for every point (u, v) ill B, the corresponding point
Ix(u, v), y(u, v), :;:(ll, v)] lies in D. See Fig. 21l. Then the function
w
= f(x(u. v), y(u. v). z(u. v»
is defined in B, has first partial deril'Otil'es lI'ith respect to u and v in B, and
a"
aw
a-
aw
aw ax
aw
Au
Ax au
ay Au
az Au
aw
aw
ax
away
aw az
av
ax av
ay av
az av
-=--+--~-+-~
(1)
-=--+--+--
In this theorem, a domain D is an open connected point set in xyz-space, where "connected"
means that any two points of D can be joined by a broken line of finitely many linear
segments all of whose points belong to D. "Open" means that every point P of D has a
neighborhood (a little ball with center P) all of whose points belong to D. For example.
the interior of a cube or of an ellipsoid (the solid without the boundary surface) is a domain.
In calculus, x, y, Z are often called the intermediate variables, in contrast with the
independent variables u, v and the dependent variable w.
Special Cases of Practical Interest
If w = f(x, y) and x = x(u, v), y = y(u, v) as before, then (1) becomes
av
aw
aw ax
aw
au
ax au
ay au
aw
aw ax
away
av
ax av
ay av
-=--+----
(2)
-=--+--
If no = f(x, y, .:::) and x = xU), y = yet), .::: =
(3)
z(t),
then
dw
aw dx
aw
dt
ax dt
ay dt
dy
1I) gives
all'
az dt
If w = f(x, y) and x = x(t), y = y(t), then (3) reduces to
(4)
dz
-=--+--+--
dw
aw
dt
ax dt
dx
aw
dr
-=--+-_.-
ay dt·
402
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Finally, the simplest case w = f(x), x = x(t) gives
(5)
E X AMP L E 1
dw
dw dx
dt
dx dt
Chain Rule
If w = x 2
i
and we define polar coordinates r, 8 by x = r cos 8, y
=
r sin 8, then (2) gives
~
a;
=
2xcos 8 - 2ysin 8
aw
2x(-r sin 8) - 2y(r cos lJ) = -2r 2 cos 8 sin lJ - 2r2 sin lJcos 8 = -2r 2 sin 28.
a8
=
2
2
2rcos 8 - 2rsin 8 = 2rcos28
=
•
Mean Value Theorems
THEOREM 2
Mean Value Theorem
Let f(x, y, z) be continuous and have continuous first partial derivatives in a
domain D in xyz-space. Let Po: (xo, Yo, zo) and P: (xo + h, Yo + k, Zo + l) be
points in D such that the straight line segment PoP joining these points lies entirely
in D. Then
(6)
f(xo
+
h, Yo
+ k,
Zo
+
l) -
f(xo, Yo,
af
Zo)
= h-
ax
+
af
kay
+
af
l-,
az
the partial derivatives being evaluated at a suitable point of that segment.
Fig. 212.
Mean value theorem for a function of two variables [Formula (7)]
Special Cases
For a function f(x, y) of two variables (satisfying assumptions as in the theorem), formula
(6) reduces to (Fig. 212)
(7)
f(xo
+
h, Yo
+ k)
at
- f(xo, Yo) = h ax
+k
at ,
a;
SEC. 9.7
Gradient of a Scalar Field. Directional Derivative
403
and for a function f(x) of a single variable, (6) becomes
f(xo + 11) - f(xo)
(8)
=
df
11-,
dx
where in (8), the domain D is a segment of the x-axis and the derivative is taken at a
suitable point between Xo and Xo + h.
[1-51
DERIVATIVE
Find dwldt by (3) or (4). Check the result by substitution
and differentiation. (Show the details.)
1. w =
+ y2, X = e 2t , y = e- 2t
V:>?
= ylx, x = g(t), y = h(t)
3. w = xY, x = cosh t. y = sinh t
4. w = xy + yz + zx, x = 2 cos t, Y = 2 sin T, z = 5t
2. w
5. w = (x 2
+ y2 +
Z2)3,
X
= (2, Y = (4, Z = (2
/6-91
PARTIAL DERIVATIVES
Find iJwliJu and iJwlav by (1) and (2). Check the result by
substitution and differentiation. (Show the details.)
6. w = 4x 2 - 4y2, X = U + 2v, y = 2u - v
7. W =x 2y2.x= eUcosv.y = eUsinv
8. w =
9.7
X4 -
4x 2y2
= 1/(x 2
Z = 2uv
9. w
+ )'4, X =
uv, y
= ulv
+
y2
+
Z2), X = u 2
+
v 2, Y = u 2 - v 2,
10. (Partial derivatives on a surface) Let w = f(x, y, z),
and let z = g(x, y) represent a surface S in space. Then
on S, the function becomes
w(x, y)
=
f[x, y, g(x, y)].
Show that its partial derivatives are obtained from
aw
af
af
iJx
ax
az ax'
iJg
-=-+--
aw
af
af
ay
ay
az
ag
a)'
-=-+-[;: = g(x. y)].
Apply this to f = x 3 + )'3 + Z2, g = x 2 + y2 and
check by substitution and direct differentiation. (The
general formula will be needed in Sec. 10.9.)
Gradient of a Scalar Field.
Directional Derivative
We shall see that some of the vector fields in applications-not all of them!---can be
obtained from scalar fields. This is a considerable advantage because scalar fields can be
handled more easily. The relation between these two kinds of fields is obtained by the
"gradient," which is thus of great practical importance.
DEFINITION 1
Gradient
The gradient of a given scalar function f(x, y, z) is denoted by grad f or Vf (read
nabla f) and is the vector function defined by
ll)
gradf
at, at, -at] = at). + -.-J
at. +-k.
at
= Vt = [ ax ay az
ax
dy
az
Here x. y, z are Cartesian coordinates in a domain in 3-space in which f is defined
and differentiable. (For curvilinear coordinates see App. 3.4.)
CHAP. 9
404
Vector Differential Calculus. Grad, Div, Curl
For instance, if f(x, y, z) = 2)'3 + 4xz + 3x, then grad f = [4z + 3, 6)'2, 4x].
The notation \' f is suggested by the differential operator V (read nabla) defined by
V =
(1*)
a
-j
ax
a
a
ay
iJ;:.
+ - j + -k.
Gradients are useful in several ways, notably in giving the rate of change of f(x. y. ;:.)
in any direction in space, in obtaining surface normal vectors, and in deriving vector fields
from scalar fields, as we are going to show in this section.
Directional Derivative
From calculus we know that the partial derivatives in (1) give the rates of change of
f(x. y. z) in the directions of the three coordinate axes. It seems natural to extend this and
ask for the rate of change of f in an arbitrw:v direction in space. This leads to the following
concept.
Dr
INITION 2
Directional Derivative
The directional derivative Dbf or dflds of a function f(x, y, z) at a point P in the
direction of a vector b is defined by (see Fig. 213)
(2)
1_
df
. f(Q) - f(P)
Dbf = - = hm
.
s->O
S
ds
Here Q is a variable point on the straight line L in the direction of b, and lsi is the
distance between P and Q. Also, s > 0 if Q lies in the direction of b (as in
Fig. 213), s < 0 if Q lies in the direction of -b, and s = 0 if Q = P.
Fig. 213.
Directional derivative
The next idea is to use Cartesian .x),z-coordinates and for b a unit vector. Then the line L
is given by
(3)
res) = x(s)i
+
y(s)j
+
z(s)k = Po
+ sb
where Po the position vector of P. Equation (2) now shows that Dbf
dflds is the
derivative of the function f(x(s), yes), z(s)) with respect to the arc length s of L. Hence.
assuming that f has continuous partial derivatives and applying the chain rule [formula
(3) in the previous section], we obtain
(4)
df
af,
af,
af,
Dbf=-=-x + - y + - z
ds
ax
ay
az
SEC 9.7
405
Gradient of a Scalar Field. Directional Derivative
where primes denote derivatives with respect to s (which are taken at s = 0). But here,
differentiating (3) gives r' = x'i + y'j + z'k = b. Hence (4) is simply the inner product
of grad f and b [see (2), Sec. 9.2]; that is,
(5)
Dbf
ATTENTION!
ds
= b·grad f
(Ibl
= 1).
If the direction is given by a vector a of any length (oF 0), then
1
df
(5*)
E X AMP L E 1
df
=-
Daf
= -ds = -I
I a·gradf·
a
Gradient. Directional Derivative
Find the directional derivative of f(x. y, .:) = 2x2 + 3.1'2 + Z2 at P: (2, L 3) in the direction of a = [1, 0, -2].
Solution.
since
lal
=
grad J = [4x. fl\,. 2.:] gives at P the vector grad J(p) = [8. fl. 6]. From this and (5*) we obtain,
Vs.
DaJ(PI=
1
1
4
V5 [1.0.-2]"[8.6.61= Vs (8+0-12)=- Vs =-1.789.
The minus sign indicates that at P the function f
i~
decreasing in the direction of a.
•
Gradient Is a Vector. Maximum Increase
grad f in (I) looks like a vector-after all, it has three components! But to prove that it
actually is a vector. since it is defined in telms of components depending on the Cartesian
coordinates, we must show that grad f has a length amI direction independent of the choice
of those coordinates. In contrast, raflax, 2aflay, afli'J:;::] also looks like a vector but
does not have a length and direction independent of the choice of Cartesian coordinates.
Incidentally, the direction makes the gradient eminently useful: grad f points in the
direction of maximum increase of f.
Vector Character of Gradient. Maximum Increase
THEOREM 1
Let f(P) = f(x. y. :;::) be a scalar function having continuous first partial derivatives
in some domain B in space. Then grad f exists in B and is a vector, that is, its lellgth
and direction are independent of the particular choice of Cartesian coordinates. {f
grad f(P) oF 0 at some point P, it has the direction of maximum illcrease of f at P.
PROOF
From (5) and the definition of inner product [(1) in Sec. 9.2] we have
(6)
Dbf = Ibllgrad fl cos l' = Igrad fl cos l'
where l' is the angle between b and grad f. Now f is a scalar function. Hence its value
at a point P depends on P but not on the particular choice of coordinates. The same holds
for the arc length s of the line L in Fig. 213, hence also for Dbf. Now (6) shows that Dbf
is maximum when cos l' = \, l' = 0, and then Dbf = Igrad fl. It follows that the length
and direction of grad f are independent of the choice of coordinates. Since l' = 0 if and
only if b has the direction of grad f, the latter is the direction of maximum increase of
f at P, provided grad f oF 0 at P.
•
406
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Gradient as Surface Normal Vector
Gradients have an important application in connection with surlaces, namely, as surlace
normal vectors, as follows. Let S be a surlace represented by f(x, y, z) = C = COllst, where
f is differentiable. Such a surface is called a level surface of f, and for different c we get
different level surlaces. Now let C be a curve on S through a point P of S. As a curve in
space, C has a representation ret) = [x(t), yet), z(t)]. For C to lie on the surlace S, the
components of r(1) must satisfy f(x, y, z) = c, that is,
f(x(t), y(1), z(t» = c.
(7)
,
[ ,
, ']
Now a tangent vector of C .
IS r (1) = x (1), Y (f), z (f) . And the tangent vectors of all
curves on S passing through P will generally form a plane, called the tangent plane of S
at P. (Exceptions occur at edges or cusps of S, for instance, for the cone in Fig. 215 at
the apex.) The normal of this plane (the straight line through P perpendicular to the tangent
plane) is called the surface normal to S at P. A vector in the direction of the surface
normal is called a surface normal vector of Sat P. We can obtain such a vector quite
simply by differentiating (7) with respect to t. By the chain rule,
af,
af,
af,
ax
ay'
iJz
-x + -v + -z
=
,
o.
(gradf)or =
Hence grad f is orthogonal to all the vectors r' in the tangent plane, so that it is a normal
vector of Sat P. Our result is as follows (see Fig. 214).
grad~Tangent plane
f=
cons)
~
/p
/
Fig. 214.
THEOREM 2
Gradient as surface normal vector
Gradient as Surface Normal Vector
Let f be a differentiable scalar function ill space. Let f(x, y, z) = c = COllst represent
a surface S. Tlzell if tlze gradient of f at a poim P of 5 is /lOT the zero vector, if is
a normal vector of 5 at P.
E X AMP L E 2
Gradient as Surface Normal Vector. Cone
Find a unit nonnal vector n of the cone of revolution
Solution. The cone is the level surface I
grad I
n
=
~
[8x,
8y,
=
;:.2
0 of I(x,
= 4(x 2
+ y2) at the point P: (I, U, 2).
y,
4(x 2 + y2) - z2. Thus (Fig. 215),
- 22],
I
Igrad I(P)I grad I(P)
=
[
z) =
grad I(P)
=
[8,
2
V5'
0,
-
U,
-4]
I ]
V5
n points downward since it has a negalJve z-component. The other unit normal vector of the cone at P is -no •
SEC. 9.7
407
Gradient of a Scalar Field Directional Derivative
n/:
p
I
I
I
I
I
I
~
Fig. 215.
Cone and unit normal vector n
Vector Fields That Are Gradients of Scalar Fields
("Potentials")
At the beginning of this section we mentioned that some vector fields have the advantage
that they can be obtained from scalar fields, which can be handled more easily. Such a
vector field is given by a vector function yep), which is obtained as the gradient of a scalar
function. say, vW) = grad f(P). The function f(P) is called a potential function or a
potential of yep). Such a v{P) and the conesponding vector field are called conservative
because in such a vector field, energy is conserved; that is, no energy is lost (or gained)
in displacing a body (or a charge in the case of an electrical field) from a point P to another
point in the field and back to P. We show this in Sec. 10.2.
Conservative fields playa central role in physics and engineering. A basic application
concerns the gravitational force (see Example 3 in Sec. 9.4) and we show that it has a
potential which satisfies Laplace's equation. the most important partial differential
equation in physics and its applications.
THEOREM 3
Gravitational Field. Laplace's Equation
The force of attraction
(8)
p
=
c
-r
r3
=
_c[x - Xo
1'3
Y - Yo
.
r3
z - zoJ
.
r3
between two particles at points Po: (Xo, Yo, zo) and P: (x. y, z) (as given by Newton's
law of gravitation) has the potellfial f(x. y. z) = clr. where r (> 0) is the distance
between Po alld P.
TllllS P = grad f = grad (elr). This potential f is a solution o/Laplace's equation
(9)
[v 2 f (read nabla squared f) is called the Laplacian of f.]
CHAP. 9
408
PROOF
Vector Differential Calculus. Grad, Div, Curl
That distance is r = «x - XO)2 + (Y - .\'0)2 + (z - <:2)2)1/ 2 . The key observation now is
that for the components of p = [PI' P2. P3] we obtain by partial differentiation
x - Xo
(lOa)
and similarly
;" (~)
(lOb)
:<:
Y - Yo
----
r3
(~) =
z-
'::0
----
,-3
From this we see that, indeed. p is the gradient of the scalar function f = eI,-. The second
statement of the theorem follows by partially differentiating (10), that is.
a~2 (~)
--+
r3
a
iJy2
C)
--+
r3
:Z22
(~)
2
I
I
r
I
=
--+
r3
3(x - xO)2
r5
3(y - )'0)2
r5
3(.:: - ZO)2
,-5
and then adding these three expressions. Their common denominator is r5. Hence the three
terms -1/,-3 contribute - 3r 2 to the numerator, and the three other terms give the sum
so that the numerator is 0, and we obtain (9).
•
V2 f is also denoted by I:::.f. The differential operator
(11)
(read "nabla squared" or "delta") is called the Laplace operator. It can be shown that
the field of force produced by any distribution of masses is given by a vector function
that is the gradient of a scalar function f. and f satisfies (9) in any region that is free of
matter.
The great importance of the Laplace equation also results from the fact that there are
other laws in physics that are of the same form as Newton's law of gravitation. For instance,
in electrostatics the force of attraction (or repulsion) between two particles of opposite (or
SEC. 9.7
Gradient of a Scalar Field Directional Derivative
409
like) charge QI and Q2 is
k
(Coulomb's law6 )
p=-r
r3
(12)
Laplace's equation will be discussed in detail in Chaps. 12 and 18.
A method for finding out whether a given vector field has a potential will be explained
in Sec. 9.9.
11-61
CALCULATION OF GRADIENTS
Find Vf. Graph some level curves f = const. lndicate Vf
by arrows at some points of these curves.
2. f = x 2 + ty2
x
3. f = -
4.
I
=
X4
+ )'4
Y
(x - 2)(y
f =
6. f = (x 5.
17-121
3)2
+ 2)
+ Cr -
1)2
8.
9.
10.
11.
12.
= x2 +
)'2
= In (x 2
+ y2),
+
;::2,
P: (3, 2, 2)
P: (4. 3)
= cos x cosh y. P: (!7T. In 2)
= x 2 + 4y2 + 9;::2, P: (3, 2. I)
= eX sin y. P: (I. 7T)
= (x 2 + )'2 + Z2)-I/2, P: (2, 1,
[13-18]
-
y2,
P: (2, I)
\.
15. T = x 3
-
16. T
=
xl(x 2
17. T
=
3x 2 )'
=-- , P: t2, 2)
x
3X)'2,
P:
('VB, V2)
+ )'2), P: (4.0)
-
)'3.
1)2 - (y
= yl(x 2
+ )'2),
= x2
2x -
=
1)2. P: (4, - 3)
)'2,
P: t-2, 6)
In (x 2
= (x 2
=
-
+
P: (5, 3)
+ y2), P: (3, 3)
+ y2 + ~2)-1/2, P:
x 2y -
(12,0, 16)
h 3 , P: (2, 3)
25. (Gradient) What does it mean if Igrad I(p)1 < Igrad I( QJI
at two points P and Q in a scalar field?
(Landscape) If ;::(x. yl = 2000 - 4x 2 - y2 [meters]
gi ves the elevation of a mountain above sea level. what
is the direction of steepest ascent at P: (3, -6)? What
does the mountain look like?
P: (4, -2)
18. T = sin x cosh y. P: (~7T. In 5)
SURFACE NORMAL
Find a normal vector of the surface at the given point P.
2)
HEAT FLOW
14. T = arctan
24.
= (x -
~7-321
Experiments show that in a temperature field, heat flows in
the direction of maximum decrease of temperature T. Find
this direction in general and at a given point P. Sketch that
direction at P as an arrow.
13. T = x 2
21.
I
I
I
f
f
I
26.
v(P).
f
f
f
f
f
f
19.
20.
23.
USE OF GRADIENTS. VELOCITY FIELDS
ELECTRIC FORCE
The force in an electrostatic field I(x, y, z) has the direction
of the gradient of f. Find VI and its value at P.
22.
Given the velocity potential f of a flow. find the velocity
v = vI of the flow and its value at P. Make a sketch of
7.
119-241
29. x 2
+ by + cz = d. any P
+ 3y2 + ;::2 = 28, P: (4, 1. 3)
+ y2 = 25, P: (4, 3, 8)
30. x 2
-
27. ax
28. x 2
31.
X4
y2
+
4;::2 = 67. P: (-2. 1, 4)
+ y4 + Z4 =
+ y2, P:
32. z = x 2
133-381
243, P: (3, 3, 3)
(3, 4. 25)
DIRECTIONAL DERIVATIVE
Find the directional derivative of
of a.
I at
P in the direction
I = x 2 + )'2 - z, P: 0, l. -2). a = [I, 1. 2]
34. I = x 2 +)'2 + .;:2. P: (2, -2, 1), a = [-1, -1. 0]
35. I = xy.;:, P: (-I, 1,3), a = [I, -2.2]
33.
6CHARLES AUGUSTIN DE COULOMB (1736--1806), French phYSicist and engineer. Coulomb's law was
derived by him from his own very precise measurements.
CHAP. 9
410
36. f = (x 2 +
y2
+ :;,2)-112, P:
Vector Differential Calculus. Grad, Div, Curl
(4, 2, -4), a
each of them two examples showing when they are
advantageous.
= [1,2, -2]
37. f = eX sin y, P: (2, ~'7T, 0), a = [2, 3, 0]
= 4x 2 + y2 + 9:;,2, P: (2.4. 0). a = [-2. -4, 3]
38. f
v(fg) = fvg
v(f") =
POTENTIALS
for a given vector field-if they exist!--can be obtained by
a method to be discussed in Sec. 9.9. In simpler cases. use
inspection. Find a potential f = grad v for given v(x, y, ;:).
39. v = [3x, 5y, -4z]
40. v
41. v
= [ye X, eX,
=
nf"-lvf
v{flg) = (Ilg2)(gY'f - f'\g)
V2(fg) = gV 2f
+
2vf o vg
+
fY'2g
43. CAS PROJECT. Equipotential Curves. Graph some
isotherms (curves of constant temperature) and
indicate directions of heat flow by arrows when the
temperature T(x. y) equals:
2;:J
[4x 3 • 3y2, -6;:]
42. Project. Useful Formulas for Gradients and
Laplacians. Prove the following formulas and give for
9.8
+ gY'f
(a) x 3
(b) sin x sinh y
3.\),2
-
(c) eX sin y.
Divergence of a Vector Field
Vector calculus owes much of its importance in engineering and physics to the gradient,
divergence, and curL From a scalar field we can obtain a vector field by the gradient
(Sec. 9.7). Conversely, from a vector field we can obtain a scalar field by the divergence,
or another vector field by the curl (to be discussed in Sec. 9.9). These concepts were
suggested by basic physical applications, as we shall see.
To begin, let Vlx, y, z) be a differentiable vector function, where x, y, z are Cartesian
coordinates, and let vI> V2, V3 be the components of v. Then the function
.
(1)
dlV
v
aVl
aV2
aV3
ax
ay
az
= -- + -- + --
is called the divergence of v or the divergence of the vector field defined by v. For
example. if
v = [3xz , 2n', _)'Z2] = 3x.:i
~
+ 2.ni - r.:::2k
.... d
~
then
,
div v
= 3z + 2x - 2yz.
Another common notation for the divergence is
div v
[a
a] •
= V· v = - . -a . ax
ay
az
[Vb V2' V3]
with [he understanding [hat the "product" (alax)v 1 in the dot product means the partial
derivative av1lax. etc. Thi~ is a convenient notation, but nothing more. Note that V· v
means the scalar div v, whereas V! means the vector grad! defined in Sec. 9.7.
SEC. 9.8
411
Divergence of a Vector Field
In Example 2 we shall see that the divergence has an important physical meaning.
Clearly, the values of a function that characterizes a physical or geometric property must
be independent of the particular choice of coordinates: that is, those values must be
invariant with respect to coordinate transformations. Accordingly, the following theorem
should hold.
THEOREM 1
Invariance of the Divergence
The divergence div v is a scalar jimctioll. that is, its mlues depend only on the
points ill space (and. of course, on v) bllt not on the choice of the coordinates in
(I). sO that with respect to other Cartesian coordinates x*, y*, z* and corre~ponding
components Vi *, V2*' V3* of v,
(2)
We shall prove this theorem in Sec. 10.7, using integrals.
Presently, let us mm [0 the more immediare practical task of gaining a feel for the
significance of the divergence as follows. Let f(x, y, z) be a twice differentiable scalar
function. Then its gradient exists,
v
af, -.af , -af] = -.-1
af. + -at.J + -at k
= grad t = [ ax iI) az
ilx
a)'
az
and we can differentiate once more, the first component with respect to x, the second with
respect to y. the third with respect to z, and then form the divergence,
Hence we have the basic result thal the divergence of the gradient is the Laplacian
(Sec. 9.7).
(3)
E X AMP L E 1
div (grad f)
=
",2t.
Gravitational Force. Laplace's Equation
The gravitational force p in Theorem 3 of the last section is the gradient of the scalar function f(x, y, z) = clr,
which satisfies Laplaces equation V2 f = U. According to (3) this implies that div p = 0 (r > 0).
•
The following example from hydrodynamics shows the physical significance of the
divergence of a vector field. (More physical details on this significance will be added in
Sec. 10.8.)
412
E X AMP L E 2
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
Flow of a Compressible Fluid. Physical Meaning of the Divergence
We consider the motion of a fluid in a region R having no sources or sinks in R, that is, no points at which
fluid is produced or disappears. The concept of fluid state is meant to cover also gases and vapors. Fluids in
the restricted sense, or liquids l water or oil, for instance), have very small compressibility, which can be neglected
in many problems. Gases and vapors have large compressibility; that is, their density p (= mass per unit volume)
depends on the coordinates x, y, z in space (and may depend on time t). We assume that our t1uid is compressible.
We consider the flow through a rectangular box B of small edges ax. /:J.y . ..k parallel to the coordinate axes
(Fig. 216), (/:J. is a standard notation for small quantities; of course, it ha;, nothing to do with the notation for the
Laplacian in (11) of Sec. 9.7.) The box B has the volume.1V = !:J.x /:J.y.1z. Let v = [VI, V2, V3] = VIi + V2j + V3k
be the velocity vector of the motion. We set
(4)
and assume that u and v are continuously differentiable vector functions of x, y, z, and t (that is, they have first
partial derivatives which are continuous). Let us calculate the change in the mass included in B by considering
the flux across the boundary, that is_ the lotal loss of mass leaving B per unit time. Consider the flow through
the left of the three faces of B that are visible in Fig_ 216, whose area is .1x j,z.. Since the vectors VI i and V3 k
are parallel to that face, the components VI and V3 of v contribute nothing to this flow. Hence the mass of fluid
entering through that face during a short time interval 0.t is given approximately by
where the subscnpt y indicates that this expre%ion refers to the left face_ The mass of fluid leaving the box
B through the opposite face during the same time interval is approximately (U2)y+.'l.Y /:J.x /:J.z /:J.t_ where the
subscript y + ~y indicates that this expression refers to the right face (which is not visible in Fig. 216)_ The
difference
is the approximate loss of mass. Two similar expressions are obtained by considering the other two pairs of
parallel faces of B.. If we add these three expressions, we find that the total loss of mass in B during the time
interval /:J.l is approximately
where
and
This loss of mass in B is caused by the time rate of change of the density and is thus equal to
up
~-Ll.VLl.l.
at
Box B
!1X
Fig. 216.
Physical interpretation of the divergence
SEC 9.8
413
Divergence of a Vector Field
If we equate both expressions. divide the resulting equation by
~V
::J.t, and let ..h.
~Y •
.1::. and .it approach zero.
then we obtain
di, u
=
div (pv)
fJp
ill
= -
or
ap
a,
+ div (pv) =
(5)
O.
This important relation is called the condition for the collsermtiolJ of lIIasS or the continuity equation of a
cOlllpre.u1bie fluid flow.
If the flow is steady, that b. independent of time. then aplat = 0 and the continuity eljuation is
div (pv)
(6)
=
o.
If the density p is constant. so that the t1uid is incompressible, then equation (6) becomes
divv
(7)
O.
=
This relation is known as the condition of incompressibility. It expresses the fact that the balance of outtlow
and inflow for a given volume element is zero at any time. Clearly. the assumption that the tlow has no sources
or sinks in R is essential to our argument.
From this discussion you should conclude and remember that. roughly speaking. tile dh'ergellce measures
outflow millus ;'l!1oW.
•
Comment. The divergence theorem of Gauss, an integral theorem involving the
divergence, follows in the next chapter (Sec. 10.7).
P R Olil;£M -S E~~
lf7]
CALCULATION OF THE DIVERGENCE
Find the divergence of the following vector functions.
1. [x 3
+ y3, 3xy2, 3<:.\·2]
[e 2x cos 2.\". e2x sin 2y. 5e 2z ]
3. [x 2 + y2, 2~yz, Z2 + x 2]
4. (x 2 + y2 + ::2)-3/2rx, v, zl
2.
5. [sin xy.
6.
7.
[VI(Y, z),
X 2 y 2 Z 2[X,
sin xy,
Z co~
V2(Z, x),
y.
xyl
v 3 (x,
y)l
zl
= [x,
y. V 3 ]. Find a V3 such that (a) div v > 0
everywhere. (b) div v > 0 if Izl < I and div v < 0 if
1:::1 > l.
9. (Incompressible flow) Show that the flow with
velocity vector v = yi is incompressible. Show that the
particles that at time t = 0 are in the cube whose faces
are portions of the planes x = 0, x = I, y = O. Y = I,
Z = 0, Z = I occupy at t = I the volume 1.
8. Let v
10. (Compressible flow) Consider the flow with velOCIty
vector v = xi. Show thm the individual particles have
the position vectors r( t) = C I e t i + c 2j + C3k with
constant C1 , ('2, ('3' Show that the particles that at I = 0
are in the cube of Prob. 9 at t = I occupy the volume e.
11. (Rotational flow) The velocity vector vex, y. <:) of an
incompressible fluid rotating in a cylindrical vessel is of
the form v = w X r, where w is the (constant) rotation
vector; see Example 5 in Sec. 9.3. Show that div v = O.
Is this plausible because of our present Example 27
12. CAS PROJECT. Visualizing the Divergence. Graph
the given velocity field v of a fluid flow in a square
centered at the origin with sides parallel to the coordinate
axes. Recall that the divergence measures outflow minus
inflow. By looking at the flow near the sides of the square,
can you see whether div v must be positive or negative
or may perhaps be zero? Then calculate div v. First do
the given flows and then do some of your own. Enjoy it.
(a) v = i
(b) v = xi
(c) v
=
xi - yj
(d) v = xi
+ yj
(e) v = - ri - yj
(0 v
= (x 2
+ y2)-I(_yi + xj)
CHAP. 9
414
Vector Differential Calculus. Grad, Div, Curl
~4.=-20 I
13. PROJECT. Useful Formulas for the Divergence. Prove
(a) div (kv) = k div v (k constant)
(b) div(fv) = fdi\'v + vo"\f
(c) div (f\g) = f\2g + '\fo'\g
(d) div (f'\ g) - div (gV f) = fV 2g - g,\2f.
CALCULATION OF THE LAPLACIAN BY (3)
Find "\2f by (3). Check by ditlerentiation. Indicate when
(3) is simpler. (Show the details of your work.)
14.
15.
Verify (b) for f = e
and v = ad + byj + c:::k.
Obtain the answer to Prob. 4 from (b). Verify (c) for
f = x 2 - y2 and g = eX + Y . Give examples of your own
for which (a)-(d) are advantageous.
X1JZ
9.9
f
f
= xyl:::2
= (y
+ x)/(y
- x)
f =::: - 4Vx + )'2
18. f = arctan (ylx)
20. f = cos 2 X - sin2 )'
2
16.
17. f
19.
f
= ~2_y2
=
cos 2xy
eXYz
Curl of a Vector Field
Gradient (Sec. 9.7), divergence (Sec. 9.8), and curl are basic in connection with fields,
and we now define and discuss the curl.
Let vex, y, z) = [Vb V2' V3] = VIi + V2j + vsk be a differentiable vector function of
the Cartesian coordinates x, y, z. Then the curl of the vector fUllction v or of the vector
field gil'en by v is defined by the "symbolic" determinant
curl v
= "\ x v =
j
k
a
a
a
ax
ay
az
VI
V2
V3
(1)
avs _ aV2)i + (aVI _ avs)j + (aV2 _ aVl)k.
( ay
a::;
az
ax
ax
ay
This is the formula when x. J, z are right-handed. If they are left-handed. the determinant
has a minus sign in front (just as in (2**) in Sec. 9.3).
Instead of curl v one also uses the notation rot v (suggested by "rotation"; see Example 2).
E X AMP L E 1
Curl of a Vector Function
Let v = [yz.
zl =
3;:x.
yzi
+ 3zxj + zk with
right-handed x, y, z. Then (1) gives
k
curl,
=
alax
alay
iJIiJ~
=
-3xi
+ yj + (3;:
3::x
- :)k = -3d
+ yj +
2zk.
•
The curl plays an important role in many applications. Let us illustrate this with a typical
basic example. More about the nature and significance of the curl will be said in
Sec. 10.9.
E X AMP L E 2
Rotation of a Rigid Body. Relation to the Curl
We have seen in Example 5. Sec. 9.3, thar a rotation of a rigid body B about a fixed axis in space can be
described by a vector w of magnitude w in the direction of the axis of rotation, where w (> 0) is the angular
speed of the rotation, and w is directed so that the rotation appears clockwise if we look in the direction of w.
According to (9), Sec. 9.3, the velocity field of the rotation can be represented in the form
v = w
X
r
SEC. 9.9
415
Curl of a Vector Field
where r is the position vector of a moving point with respect to a Cartesian coordinate system harillg the origill
on the axis of rotation. Let us choose right-handed Cartesian coordinates such that the axis of rotation is the
::-axis. Then (see Example 2 in Sec. 9.4)
w = [0.
0,
wI
v = w X r = [-iVY,
= ivk,
WX,
0] = -Wl"i
+
ivXj.
Hence
k
j
curl v =
This
THEOREM 1
prove~
iJ
a
a
ax
ay
iJ::
-wy
WX
o
= [0, O.
2wJ = 2wk = 2w.
•
the following theorem.
Rotating Body and Curl
The curl of the velocity field of a mtating rigid hody has the direction of the axis
of the rotation, and its magnitude equals twice the angular ~peed of the rotation
The following two relations among grad, div, and curl are basic and shed further light on
the nature of the curl.
THEOREM 2
Grad. Div, Curl
Gradient fields are irrotational. That is, if a continllol/sly d{fferentiable vector
function is the gradient of a scalar function f, then its cllrl is the zero vector,
(2)
curl (grad f)
=
O.
Furfhel71lOre, the divergence of the cllrl of a t'rvice continllously dijferentiable vector
function v is :ero,
(3)
PROOF
E X AMP L E 3
div (curl
v)
= O.
Both (2) and (3) follow directly from the definitions by straightforward calculation. In the
proof of (3) the six terms cancel in pairs.
•
Rotational and Irrotational Fields
The field in Example 2 is not motatlOnal. A similar velocity field is obtained by stirring tea or coffee in a cur
The gravitational field in Theorem 3 of Sec. 9.7 has curl p = O. It is an irrotational gradient field.
•
The term "irrotationar' for curl v = 0 is suggested by the use of the curl for characterizing
the rotation in a field. If a gradient field occurs elsewhere, not as a velocity field, it is
usually called conservative (see Sec. 9.7). Relation (3) is plausible because of the
interpretation of the curl as a rotation and of the divergence as a flux (see Example 2 in
Sec. 9.8).
Finally, since the curl is defined in terms of coordinates. we should do what we did for
the gradient in Sec. 9.7, namely, to find out whether the curl is a vector. This is true, as
follows.
CHAP. 9
416
THEOREM 3
Vector Differential Calculus. Grad, Div, Curl
Invariance of the Curl
curl v is a vector. That is, it has a length and direction that are independent of the
particular choice of a Cartesian coordinate system in space. (Proof in App. 4.)
11-61
CALCULATION OF CURL
Find curl v for v given with respect to right-handed
Cartesian coordinates. Show the details of your work.
2
2x ,
2. [yn, z n ,
0]
1. [yo
xn]
eX siny,
3. [ex cos y,
4. (x 2
+
(n
+
y2
5. [In (x 2
+
6. [sin y.
cos
> 0, integer)
0]
Z2)-3/2[X,
y,
y2), 2 arctan (y/x),
Z,
z]
0]
-tan x]
7. What direction does curl v have if v is a vector parallel
to the xz-plane?
S. Prove Theorem 2. Give two examples for (2) and (3)
each.
19-141
FLUID FLOW
Let v be the velocity vector of a steady fluid flow. Is the
flow irrotational? Incompressible? Find the streamlines
(the paths of the particles). Hint. See the answers to Probs.
9 and 11 for a determination of a path.
9. v
=
[0,
10. v
=
[_y2,
11. v
=
[y,
12. v
=
[csc x,
0]
Z2,
4,
-x,
13. v
L4. v
= [x,
= [y3,
L5. WRITING PROJECT. Summary on Grad, Div,
Curl. List the definition and most important facts and
formulas for grad, div, curl, and '17 2 • Use your list to
write a corresponding essay of 3-4 pages. Include
typical examples of your own.
L6. PROJECT. Useful Formulas for the Curl. Assuming
sufficient differentiability, show that
(a) curl (u
+
v) = curl u
+ curl v
(b) div (curl v) = 0
(c) curl (fv) = (grad f) x v
+
f curl v
(d) curl (grad f) = 0
(e) div (u x v)
117-~
=
v-curl u - u-curl v.
EXPRESSIONS INVOLVING THE CURL
With respect to right-handed coordinates, let
u = [y2, .;:2, x 2], v = [YZ, ;:x, .\)'], f = xyz, and
g = x + Y + z. Find the following expressions. If one of
the formulas in Project 16 applies. use it to check your
result. (Show the details of your work.)
17. curl v, curl (fv), curl (gv)
0]
LS. curl (fu), curl (gu)
L9. u x curl v, v x curl v, u-curl v, v-curl u
0]
sec x,
0]
.•
20. curl (u x v), curl (v x u)
TIONS AND PROBLEMS
1. Why did we discuss vectors in R2 and ~ in a separate
chapter, in addition to Chap. 7 on R n ?
6. Explain "right-handed coordinates," "orthonormal basis,"
"tangential acceleration."
2. What are applications that motivate inner products,
vector products, scalar triple products?
7. What is the definition of the divergence? Its physical
meaning? Its relation to the Laplacian?
3. What is wrong with the expression a x b x c? With
a-b-c? With (a-b) x c?
4. What are scalar fields? Vector fields? Potentials? Give
examples.
8. Granted sufficient differentiability of a scalar function
f and a vector function v, which of the following make
sense? gradf, f gradf, v gradf, v-gradf, divf,
div v, div (fV), curl (fv), curl f, .f curl v, v curl f.
5. What is the gradient? How is it related to directional
derivatives?
9. If ret) represents a motion, what is r' (t), Ir' (01, r"(t),
Ir"(t)I?
417
Summary of Chapter 9
10. How do you express the resultant of forces, the moment
of a force, and the work done by a force in terms of
vectors?
L1-201
VECTOR ADDITION,
SCALAR MULTIPLICATION, PRODUCTS
In right-handed coordinates let a = [3, 2, 7],
b = [6, 5, -4], c = [1, 8, 0], d = [9, -2,
Find
11. 4a + b - c - 2d
0].
3a· 4a, I 2a • a, I21a1 2 , Ibl 2
2c x 5d, 10c x d
(a x b)·c, a·(b x c), (a b
20. Iial - Ibll, la
+
c)
b)
bl, lal
31. (Moment) In what cases is the moment of a force p "" 0
zero?
32. (Velocity, acceleration) Find the velocity, speed, and
acceleration of the motion given by
at the point P: [5/'\1'2,
curve is the path?
(a x b) x c, a x (b x c)
18. llllal)a, (lIlcl)c
19. (a b d), (d a
30. (Moment) Find the moment vector m of p = [4, 2, 01
about P: (5, 1, 0) if p acts on a line through (1, 4, 0).
Make a sketch.
ret) = [5 cos t,
12. a·b, a·c, a x c
13. b x b, a x b, b x a
14.
15.
16.
17.
29. (Component) When is the component of a in the
direction of b negative? Zero?
+ Ibl
21. (Angle) Find the angle between a and b. Between c and
d. Sketch c and d.
22. (Angle) Find the angle between the planes
4x + 3y - z = 2 and x + y + Z = 1.
23. In what case is u x v = v x u? u·v = v·u·!
24. (Resultant) Find u such that a, b, c, d above, and u are
in equilibrium.
25. (Resultant) Find the most general v such that the resultant
of a, b, c, d above, and v is parallel to the .1y-plane.
26. (Work) Find the work done by q = [5, 1, 0] in the
displacement from (4, 4, 0) to (6, -1, 0).
27. (Component) Find thecomponentofu= [-1, 5, 0]
in the direction of v = [3, 4, 0).
28. (Component) In what cases is the component of a in
the direction of b equal to the component of b in the
direction of a?
sin t,
2t]
1/'\1'2, 7T121 What kind of
33. (Tetrahedron) Find the volume of the tetrahedron with
vertices (0, 0, 0), (I, 2, 0), (3, -3,0), (I, 1,5).
34. (Plane) Find an equation of the plane through (1, 0, 2),
(2, 3, 5), (3, 5, 7).
35. (Linear dependence) Are [2, -1, 3], [4, 2, -5],
[-1, 6, 0] linearly dependent? (Give reason.)
136-451
GRAD, DIV, CURL, V 2 ,
DIRECTIONAL DERIVATIVE
Let f = zy
Find
+
36. grad f and
37. (grad f)
X
yx, v = [y, z, 4~ - x], w =
b· 2 , Z2, x 2 ].
f grad f at (3, 4, 0)
grad f, (grad f). grad f
38. div v, div w
39. curl v, curl w
40. curl (grad f), div (grad f), div v
41. V2(f), V2(f2)
42. Dwf at (1. 2, 0)
43. Dvf at (3, 7, 5)
44. div (v x w)
45. curl (v x w)
+
curl (w x v)
Vector Differential Calculus. Grad, Div, Curl
All vectors of the form a = [aI' (/2, (13] = (IIi + a2j
vector space R3 with componentwise vector addition
and componentwise scalar multiplication
(2)
(c
+
(/3k
constitute the real
a scalar, a real number)
(Sec. 9,1).
418
CHAP. 9
Vector Differential Calculus. Grad, Div, Curl
For instance, the resultant of forces a and b is the sum a + b.
The inner product or dot product of two vectors is defmed by
(Sec. 9.2)
(3)
where'}' is the angle between a and b. This gives for the norm or length lal of a
(4)
as well as a formula for '}'. If a- b = O. we call a and b orthogonal. The dot product
is suggested by the work W = p - d done by a force p in a displacement d.
The vector product or cross product v = a x b is a vector of length
(5)
la x
hi
=
(Sec. 9.3)
lallbl sin '}'
and perpendicular to both a and b such that a, b, v form a right-handed triple. In
terms of components with respect to right-handed coordinates,
j
k
(Sec. 9.3).
(6)
The vector product is suggested, for instance, by moments of forces or by rotations.
CAUTION! This multiplication is anticommutative, a x b = -b x a, and is not
associative.
An (oblique) box with edges a, b, c has volume equal to the absolute value of
the scalar triple product
(7)
(a
b
c)
= a-(b x c) = (a x b)-c.
Sections 9.4-9.9 extend differential calculus to vector functions
and to vector functions of more than one variable (see below). The derivative of
v(t) is
(8)
,dv
v(t
v = - = lim
dt
.It_O
+ !1t)
!1t
- v(t)
=
['
"]
Vb V2, V3
=
,.
VII
+
,.
V2J
+
'k
V3
•
Differentiation rules are as in calculus. They imply (Sec. 9.4)
(u-v)' = u' -v
+
u-v',
(u x v)'
= u' x v + u
X
v'.
Curves C in space represented by the position vector r(t) have r' (t) as a tangent
vector (the velocity in mechanics when t is time), r' (s) (s arc length, Sec. 9.5) as the
unit tangent vector, and Ir"(s)/ = K as the curvature (the acceleration in mechanics).
419
Summary of Chapter 9
Vector functions vex. y. z) = [UI(X. y. z), U2(x, y. z), U3(X, y, z)] represent vector
fields in space. Partial derivatives with respect to the Cartesian coordinates x. y. Z
are obtained componentwise. for instance,
~ -_ [aUI
~
aU2
aU3J
fu'~'ili
The gradient of a scalar function
~
fu
~
(Sec. 9.6).
is
af]
= V f = [ -af , -af , -.ax aJ Clz
grad f
(9)
f
aU2 •
=aU-Il.+ J +aU3
-k
(Sec. 9.7).
The directional derivative of f in the direction of a vector a is
df
I
D f = = -a-vf
(10)
(Sec. 9.7).
lal
ds
a
The divergence of a vector function v is
.
dlv
(11)
aU
ax
aU2
ay
aU3
az'
= - -I + - - +--
v = v-v
(Sec. 9.8).
The curl of v is
k
j
(12)
curl v = \" x v =
a
a
iJ
ax
ay
az
UI
U2
u3
(Sec. 9.9)
or minus the determinant if the coordinates are left-handed.
Some basic formulas for grad, div. curl are (Secs. 9.7-9.9)
Wfg)
= fvg + gVf
(13)
v(f/g)
O/g2)(gVf - fVg)
=
div(fv)
= fdivv + v-vf
div (fVg)
= fv 2g + vf-vg
(14)
v2f
(15)
= div (\"f)
\"2(fg) = gv 2f
curl (fV)
+
2vf-vg
+
fv 2g
= V f x v + f curl v
(16)
div (u x v)
= v-curl u - u-curl v
curl (V'f)
(17)
div (curl v)
=
0
= o.
For grad, diy, curl, and v 2 in curvilinear coordinates see App. A3.4.
CHAPTER
.,
-,
10
Vector Integral Calculus.
Integral Theorems
This chapter is the companion to Chap. 9. Whereas the previous chapter dealt with
differentiation in vector calculus, this chapter concerns integration. This vector integral
calculus extends integrals as known from calculus to integrals over curves ("line
integrals"). surfaces ("surface integrals"). and solids. We shall see that these integrals have
basic engineering applications in solid mechanics, in fluid flow. and in heat problems.
These different kinds of integrals can be transformed into one another. This is done to
simplify evaluations or to gain useful general formulas, for instance, in potential theory
(see Sec. 10.8). Such transformations are done by the powerful formulas of Green (line
integrals into double integrals or conversely, Sec. 10.4), Gauss (surface integrals into triple
integrals or conversely. Sec. 10.7), and Stokes (line integrals into surface integrals or
conversely, Sec. 10.9).
The root of these transformations was largely physical intuition. The corresponding
formulas involve the divergence and the curl and will thus lead to a deeper physical
understanding of these two operations.
Prerequisite: Elementary integral calculus, Sees. 9.7-9.9
Sections that may be omitted in a shorter course: 10.3. 10.5. 10.8
References and Answers to Problems: App. I Part B. App. 2
10.1
Line Integrals
The concept of a line integral is a simple and natural generalization of a definite integral
J
b
(1)
f(x) dx
a
known from calculus. [n (I) we integrate the integrand f(x) from x = a along the x-axis
to x = b. [n a line integral we shall integrate a given function, also called the integrand,
along a curve C in space (or in the plane). Hence curve integral would be a better name,
but line integral is standard.
We represent the curve C by a parametric representation (as in Sec. 9.5)
(2)
420
ret) = [x(t), yet), z(t)] = x(t)i
+ y(t)j + z(t)k
(a ~ t ~ b).
SEC. 10.1
Line Integrals
421
)
B
(C
(a)
(b)
Fig. 217.
Oriented curve
The curve C is called the path of integration, A: rea) its initial point, and B: reb) its
terminal point. C is now oriented. The direction from A to B, in which t increases, is called
the positive direction On C and can be marked by an arrow (as in Fig. 217a). The points
A and B may coincide (as in Fig. 217b). Then C is called a closed path.
C is called a smooth curve if it has at each point a unique tangent whose direction varies
continuously as we move along C. Technically: r(t) in (2) is differentiable and the derivative
r' (t) = drldt is continuous and different from the zero vector at every point of C.
General Assumption
In this book, every path of integration of a line integral is assumed to be piecewise smooth;
that is, it consists of finitely many smooth curves.
For example, the boundary curve of a square is piecewise smooth, consisting of four
smooth curves (segments, the four sides).
Definition and Evaluation of Line Integrals
A line integral of a vector function F(r) over a curve C: r(t) [as in (2)] is defined by
JF(r)edr J
b
(3)
C
r
F(r(t)er'(t)dt
=
a
(see Sec. 9.2 for the dot product). In terms of components, with dr
in Sec. 9.5 and ' = dldt, formula (3) becomes
=
[dx,
dy,
,
dr
dt
dz} as
JF(r)edr = J(FI dx + F2 dy + F3 dz)
C
C
(3')
b
=
J
(FIX'
+
F 2 y'
+
F 3 z') dt.
a
If the path of integration C in (3) is a closed curve, then instead of
we also write
f·
c
Note that the integrand in (3) is a scalar, not a vector, because we take the dot product.
Indeed, Fer'/lr'l is the tangential component of F. (For "component" see (11) in Sec. 9.2.)
CHAP. 10
422
Vector Integral Calculus. Integral Theorems
We see that the integral in (3) on the right is a definite integral of a function of t taken
over the interval a ~ t ~ b on the t-axis in the positive direction (the direction of increasing
t). This definite integral exists for continuous F and piecewise smooth C, because this
makes For' piecewise continuous.
Line integrals (3) arise naturally in mechanics. where they give the work done by a
force F in a displacement along C (details and examples below). We may thus call the
line integral (3) the work integral. Other forms of the line integral will be discussed later
in this section.
E X AMP L E 1
Evaluation of a Line Integral in the Plane
Find the value of the line integral (3) when F(r) = [-y,
Fig. 218 from A to B.
Solution.
YL
R
tit) = em. t.
I)
We may represent C by ret) = [cos t. sin t] = cos t i
yet) = sin t. and
=
F(r(t»
-y(tH - x(t)y(t)j
By differentiation. r' (t) = [-sin t,
_cos t = II in the second term]
A_
1
Fig. 218.
f
=
f
=
f~"2 ~ (l -
cos t]
=
~in
t :'§ 71"/2. Then
-cos t sin t] = -sin t i-cos t sin t j.
t,
-sin t i + cos t j, so that by (3) [use (10) in App. 3.1; set
F(r) - dr
f
rr/2
rr/2
f OI/2(-dl/)
I
cos 2t) dt -
f- i
0 -
=
•
= 0.4521.
Line Integral in Space
The evaluation of line integrals in space is practically the same as it is in the plane. To see this. find the value
of (3) when F(r) = [~, x. y] = :::i + xj + yk and C is the helix (Fig. 219)
z
r(t) = [cos t. sin t. 3tJ = cos t i
(4)
Solution.
From (4) we have
lett)
Example 2
The dot product is 3t( -sin t)
+
f
2,,-
c
F(r)-dr =
f°
+ sin t j + 3tk
= cos t. y(t) = sin t, :::(t) =
+ costj +
F(r(t))-r'(t) = (3ti
Fig. 219.
= [-
+ sin t j, where U :'§
[-sin t. -cos t sin t[ - [-sin t, cos t] dt =
(sin2 t - cos2 t sin t) dt
c o o
x
Example 1
EXAMPLE 2
= -yi - xyj and C is the circular arc in
-xy]
cos 2 t
+ 3 sin t.
(-3tsint
+
3t. Thus
sintk)-(-sinti
+ costj + 3k).
Hence (3) gives
cos 2 t
+ 3sint)dt
=
671"
+
71"
+0
=
771"= 21.99.
•
Simple general properties of the line integral (3) follow directly from corresponding
properties of the definite integral in calculus, namely,
f kFodr = k f
(Sa)
c
(5b)
f
(F
c
+ G)°dr =
C
Fig. 220.
Formula (Sc)
(5c)
(k constant)
Fodr
f Fodr + f Godr
c
C
f Fodr = f Fodr + f Fodr
C
c,
C2
(Fig. 220)
SEC. 10.1
423
Line Integrals
where in (Sc) the path C is subdivided into two arcs C 1 and C2 that have the same
orientation as C (Fig. 220). In (Sb) the orientation of C is the same in all three integrals.
If the sense of integration along C is reversed, the value of the integral is multiplied by
-1. However, we note the following independence if the sense is preserved.
THEOREM 1
Direction-Preserving Parametric Transformations
Any representations of C that give the same positive direction on C also yield the
same value of the line integral (3).
PROOF
A proof follows by the chain rule, where ret) is the given representation, t = cp(t*) with
a positive derivative dtldt* is the transformation, with a* ~ t* ~ b* corresponding to
a ~ t ~ bin (3), and we write ret) = r(cp(t*» = r*(t*). Then dt = (dtldt*) dt* and
f
c
F(r*)odr*
dr dt
F(r(cp(t*») ° ~~ dt*
u"
dt dt*
f
=f
=
b*
dr
F(r(t»° - dt
dt
b
u
=
f
c
F(r) ° dr.
•
Motivation of the Line Integral (3):
Work Done by a Force
The work W done by a constant force F in the displacement along a straight segment d
is W = Fod; see Example 2 in Sec. 9.2. This suggests that we define the work W done
by a variable force F in the displacement along a curve C: ret) a~ the limit of sums of
works done in displacements along small chords of C. We show that this definition amounts
to defining W by the line integral (3).
For this we choose points to (= a) < tl < ... < tn (= b). Then the work LlWm done
by F(r(tm in the straight displacement from r(tm} to r(tm.+ 1) is
»
The sum ofthese n works is Wn = LlWo + ... + LlWn_1.lfwe choose points and consider
Wn for every II arbitrarily but so that the greatest Lltl11 approaches zero as n ---? 00, then
the limit of Wn as n ---700 is the line integral (3). This integral exists because of our general
assumption that F is continuous and C is piecewise smooth: this makes r' (t) continuous,
except at finitely many points where C may have comers or cusps.
•
E X AMP L E 3
Work Done by a Variable Force
IfF in Example I is a force. the work done by F in the displacement along the quarter-circle is 0.4521, measured
in snitable nnits, say, newton-meters (nt'm, also called joules, abbreviation J; see also front cover). Similarly in
Example 2.
•
424
E X AMP L E 4
CHAP. 10
Vector Integral Calculus. Integral Theorems
Work Done Equals the Gain in Kinetic Energy
Let F be a force, so that (3) is work. Let t be time, so that dr/dt
(6)
W=
f
v,
=
velocity. Then we can write (3) as
b
F·dr
J
=
F(r(t))·v(t)dt.
a
C
Now by Newton's second law (force = mass X acceleration),
F = mr"(t)
where
III
mv' (t),
=
is the mass of the body displaced. Substitution into (5) gives [see (11), Sec. 9.4]
b
J
W =
IIlV' •
Jb (v.v)' dt
vdt =
III
a
--
Ivl It~b
m
2
=
2
-
2
a
t~a
.
On the right, mlvl 2 /2 is the kinetic energy. Hence the work done equals the gain in kinetic energy. This is a
basic law in mechanics.
•
Other Forms of Line Integrals
The line integrals
(7)
are special cases of (3) when F = F1i or F 2 j or F3k, respectively.
Furthermore, without taking a dot product as in (3) we can obtain a line integral whose
value is a vector rather than a scalar, namely,
b
(8)
J F(r) dt = J F(r(t»
C
dt
=
a
f
b
[F1(r(t»,
F 2 (r(t»,
Obviously, a special case of (7) is obtained by taking Fl
f fer)
(8*)
F 3 (r(t»] dt.
a
C
dt =
f
=
f, F2
=
F3
= O. Then
b
f(r(t» dt
a
with C as in (2). The evaluation is similar to that before.
E X AMP L E 5
A Line Integral of the Form (8)
Integrate F(r) = [xy. yz, zJ along the helix in Example 2.
Solution.
F(r(t)) = [cos t sin t, 3t sin t, 3t] integrated with respect to t from 0 to 271" gives
2-,,-
fo F(rVJJ dt
[
=
-
~ cos2 t,
3 sin t - 3t cos t,
"23
2
t ]
127T
0
= [0,
•
Path Dependence
Path dependence of line integrals is practically and theoretically so important that we
formulate it as a theorem. And a whole section (Sec. 10.2) will be devoted to conditions
under which path dependence does not occur.
SEC 10.1
425
Line Integrals
THEOREM 2
r
Path Dependence
The line integral (3) generally depends not ollly all F alld all the endpoints A and
B of the path, but also on the path Use?! along which the integral is taken.
PROOF
Almost any example will show this. Take, for instance. the straight segment
C1: rl(t) = [t, t, 0] and the parabola C2: r 2(t) = [t, t 2, 0] with 0 ~ t ~ 1 (Fig. 22]) and
integrate F = [0, xy, 0]. Then F(r 1 (t»· rl (t) = t 2, F(r2(t»· r2(t) = 2t 4 , so that integration
gives L/3 and 2/5, respectively.
•
l~B
1
Fig. 221.
Proof of Theorem 2
...
11-121
Calculate
LINE INTEGRAL. WORK DONE
BY A FORCE
11. F = [ex, e Y , e Z ], r =
(2, 4, 4). Sketch C.
f
12. F = [y2,
2. F as in Prob. 1, C the shortest path from A to B. Is the
integral smaller? Give reason.
3. F as in Prob. 1, C from A straighL to (2. 0). then
vertically up to B
=
cos 2
:::],
C as in Prob. 7. Sketch C.
F(r)· dr for the following data. If F is a force.
c
this gives the work done in the displacement along C.
(Show the details.)
1. F = [y3, x 3], C the parabola y = 5x 2 from A: (0, 0)
to B: (2,20)
4. F
x 2,
[t, P, t2 ] from (0, 0, 0) to
[x 2, y2,
(-2.0), y
~
0], C the semicircle from (2, 0) to
0
5. F = [xy2, x~], C: r = [cosh t, sinh t, 0],
o ~ t ~ 2. Sketch C.
6. F = [ex, e Y ] clockwise along the circle with center
(0, 0) from (1, 0) to (0, -1)
7. F = [z, x, y], C: r
to (1, 0, 417)
=
[cos t,
sin t,
t] from (1, 0, 0)
13. WRITING PROJECT. From Definite Integrals to
Line Integrals. Write a short report (1-2 pages) with
examples on line integrals as generalizations of definite
integrals. The latter give the area under a curve. Explain
the corresponding geometric interpretation of a line
integral.
14. PROJECT. Independence of Representation.
Dependence on Path. Consider the integral
where F = [xy, _y2].
f
F(r)· dr,
C
(a) One path, several representations. Find the value
of the integral when r = [cos t, sin t], 0 ~ t ~ 1712.
Show that the value remains the same if you set t = - p
or t = p2 or apply two other parametric transformations
of your own choice.
(b) Several paths. Evaluate the integral when
C: y = x n , thus r = [t, t"l, 0 ~ t ~ 1, where
n = 1,2,3, .... Note that these infinitely many paths
have the same endpoints.
8. F = [coshx, sinhy. eZ ] . C: r = [t. P, t3 ] from
(0, 0, 0) to (!, ~)
9. F as in Prob. 8. C the straight segment from (0. O. 0)
to (!, ~)
(c) Limit. What is the limit in (b) as n --+ oo? Can you
confirm your result by direct integration without
referring to (b)?
10. F = [x, -z, 2y] from (0, 0, 0) straight to (1, 1,0),
then to (1, 1, 1), back to (0, 0, 0)
(d) Show path dependence with a simple example of
your choice involving two paths.
i,
i,
CHAP. 10
426
115-181
Vector Integral Calculus. Integral Theorems
INTEGRALS OF THE FORMS (8) AND (8*)
Evaluate (8) or (8*) with F or f and C as follows.
f = x 2 + y2, c: r = [t, 4t, 0], 0 ~ t ~ 1
16. f = 1 - sinh2 x, C the catenary r = [t, cosh t],
15.
0~t32
17. F
= [y2,
[3 cos t,
(L = Length of 0.
(9)
X2], C the helix
3 sin t, 2t], 0 3 t ~ 817
Z2,
20. Using (9), find a bound for the absolute value of the
work W done by the force F = [x 2 , y] in the
18. F = [(xy)1/3, (y/x) 1/3, 0], C the hypocycloid
r = [cos 3 t. sin 3 t, 0]. 0 ~ t ~ 17/4
10.2
19. (ML-Inequality, Estimation of Line Integrals) Let F
be a vector function defined on a curve C. Let IFI be
bounded. say. IFI ~ M on C, where M is some positive
number. Show that
displacement along the segment from
(0. Q)
to (3, 4).
Path Independence of Line Integrals
In this section we consider line integrals
(1)
Fig. 222. Path
independence
(dr
=
[£lx,
d.\',
dz])
as before, and we shall now find conditions under which (I) is path independent in a
domain D in space. By definition this means that for every pair of endpoints A, B in D
the integral (1) has the same value for all paths in D that begin at A and end at B. (See
Fig. 222. See Sec. 9.6 for "domain.")
Path independence is important. For instance, in mechanics it may mean that we have
to do the same amount of work regardless of the path to the mountaintop, be it short and
steep or long and gentle. Or it may mean that in releasing an elastic spring we get back
the work done in expanding it. Not all forces are of this type-think of swimming in a
big round pool in which the water is rotating as in a whirlpool.
We shall follow up three ideas that will give path independence of (1) in a domain D
if and only if:
= grad j (see Sec. 9.7 for the gradient).
(Theorem])
F
(Theorem 2)
Integration around closed curves C in D always gives O.
(Theorem 3)
curl F
=
0 (provided D is simply connected, as defined below).
Do you see that these theorems can help in understanding the examples and
counterexample just mentioned?
Let us begin our discussion with the following very practical criterion for path
independence.
THEOREM 1
Path Independence
A line integral (1) with continuous Fl , F2 , F3 ill a domain D in space is path
independent in D if and only ifF = [Flo F2 , F 3 ] is the gradient of some function
jill D,
(2)
F
= gradj,
thus.
SEC. 10.2
427
Path Independence of Line Integrals
PROOF
(a) We assume that (2) holds for some function .f in D and show that this implies path
independence. Let C be any path in D from any point A to any point B in D, given by
ret) = [x(t), yet), ::(t)], where a ~ t ~ b. Then from (2). the chain rule in Sec. 9.6, and
(3') in the last section we obtain
Ic
(F1dx
+
F2 dy
+
Ic (~f
F3 d::) =
cJx
~f
iJy
+
dx
dy
+
~f
iJz
d::)
Jb( - - + - - +aZ.- bdf
It=b
=I
=
c.t
=
af dx
ax dt
a
-f' dt
af dy
ay dt
af dZ)
dt
dt
f[x(t), yet), z(t)J
a
t=a
= f(x(b),
=
y(b), z(b)) - .f(x(a), yea), z(a)
feB) - f(A).
(b) The more complicated proof of the converse, that path independence implies (2)
for some f, is given in App. 4.
•
The last formula in part (a) of the proof,
J
B
(3)
(F] dx
+
F2 dy
+
F3 dz)
=
.f(B) - f(A)
[F
=
grad.f]
A
is the analog of the usual formula for definite integrals in calculus.
J
b
g(X)
dx
=
C(x)
a
Ib = G(b) -
C(a)
[C'(x) = g(x)].
a
Formula (3) should be applied whenever a line integral is independent of path.
Potential theory relates to our present discussion if we remember from Sec. 9.7 that f is
called a potential of F = grad f. Thus the integral (1) is independent of path in D if and
only if F is the gradient of a potential in D.
E X AMP L E 1
Path Independence
Show that the integrdl
f
F dr =
0
c
f
c
(2x dx
+ 2)' dy + 4;: dz) is path independent in any domain in space
and find its value in the integration from A: (0, O. 0) to B: (2. 2. 2).
Solution. F = [2y. 2)" 4;:] = grad i. where i =,\'2 + )'2 +
ai/a::. = 4.:: = F3 . Hence the integral is independent of
ai/ax = 2y = F I , ai/ay = 2)' = F2 ,
path according to Theorem I, and (3) gives
2;:2 because
I(B) - I(A) = i(l. 1. 1) - ItO. O. 0) = 4 + 4 + 8 = 16.
If you want to check this. use the most convenient path C: ret) = [I. I, I]. 0 ~ I ~ 1. on which
F(r(l) = [2/, 21, 411, so that F(r(/j) r'(I) = 21 + 21 + 41 = 8/. and integration ti-om 0 to 2 gives 8.22/2 = 16.
If you did not see the potential by inspection. use the method in the next example.
•
0
E X AMP L E 2
Path Independence. Determination of a Potential
Evaluate the integrall =
f
2
(3x dy
c
has a potential and applying (3).
+ 2)'.:: d)' +
y2 d:;;)
from A: (0, I, 2) to B: (I, - I, 7) by showing that F
428
CHAP. 10
Vector Integral Calculus. Integral Theorems
Solution.
If F has a potential
f.
we should have
Iy
= F2 = 2yz,
We show that we can satisfy these conditions. By integration of fx and differentiation,
I
= x
This gives f(x,
y,
3
+ g(y, z),
z) = x 3 +
I
y2;:
fy = gy = 2y;:,
g = y2Z
+
hi = 0
h = 0,
say.
I
h(;:,),
=
x3 +
y2 Z
+ h(::.)
and by (3),
= 1(1, -1, 7) -
f(O, 1, 2)
= 1 + 7 - (0 +
2)
= 6.
•
Path Independence and Integration
Around Closed Curves
The simple idea is that two paths with common endpoints (Fig. 223) make up a single
closed curve. This gives almost immediately
THEOREM 2
Path Independence
The integral (1) is path independent in a domain D if and only if its value around
ever}' closed path in D is zero.
PROOF
If we have path independence, then integration from A to B along C1 and along C2 in
Fig. 223. Proof of
Theorem 2
Fig. 223 gives the same value. Now C1 and C2 together make up a closed curve C, and
if we integrate from A along C1 to B as before, but then in the opposite sense along C2
back to A (so that this second integral is multiplied by -]), the sum of the two integrals
is zero, but this is the integral around the closed curve C.
Conversely, assume that the integral around any closed path C in D is zero. Given any
points A and B and any two curves C 1 and C2 from A to B in D, we see that C1 with the
orientation reversed and C2 together form a closed path C. By assumption, the integral
over C is zero. Hence the integrals over C] and C2 , both taken from A to B, must be equal.
This proves the theorem.
•
B
Work. Conservative and Nonconservative (Dissipative) Physical Systems
Recall from the last section that in mechanics, the integral (1) gives the work done by a
force F in the displacement of a body along the curve C. Then Theorem 2 states that work
is path independent in D if and only if its value is zero for displacement around every
closed path in D. Furthermore, Theorem] tells us that this happens if and only if F is the
gradient of a potential in D. In this case, F and the vector field defined by F are called
conservative in D because in this case mechanical energy is conserved; that i!>, no work
is done in the displacement from a point A and back to A. Similarly for the displacement
of an electrical charge (an electron, for instance) in a conservative electrostatic field.
Physically, the kinetic energy of a body can be interpreted as the ability of the body to
do work by virtue of its motion, and if the body moves in a conservative field of force,
after the completion of a round trip the body will return to its initial position with the
same kinetic energy it had originally. For instance, the gravitational force is conservative;
if we throw a ball vertically up, it will (if we assume air resistance to be negligible) return
to our hand with the same kinetic energy it had when it left our band.
SEC. 10.2
429
Path Independence of Line Integrals
Friction, air resistance, and water resistance always act against the direction of motion,
tending to diminish the total mechanical energy of a system (usually converting it into
heat or mechanical energy of the surrounding medium. or both), and if in the motion of
a body these forces are so large that they can no longer be neglected, then the resultant
F of the forces acting on the body is no longer conservative. Quite generally, a physical
system is called conservative if all the forces acting in it are conservati ve; otherwise it
is called non conservative or dissipative.
Path Independence and Exactness
of Differential Forms
Theorem I relates path independence of the line integral (I) to the gradient and Theorem
2 to integration around closed curves. A third idea (leading to Theorems 3* and 3, below)
relates path independence to the exactness of the differential form (or Pfaff/an f017l11)
(4)
under the integral sign in (1). This form (4) is called exact in a domain D in space if it
is the differential
df
af
af
af
= - £Ix + - dv + - d::. = (uradf)-dr
ax
ay'
az
to
of a differentiable function f(x, y, z) everywhere in D. that is, if we have
F-dr = df.
Comparing these two formulas. we see that the form (4) is exact if and only if there is a
differentiable function f(x, y, z) in D such that everywhere in D.
(5)
F = gradf,
thus,
Fl =
af
ax '
Hence Theorem l implies
THEOREM 3*
Path Independence
The integral (1) is path independent in a domain D in :,pace (f and only if the
d(fferentialfo171l (4) has continuous coefficient functions Flo F2 , F3 and is exact in D.
This theorem is practically important because there is a useful exactness criterion To
formulate the criterion, we need the following concept, which is of general interest.
A domain D is called simply connected if every closed curve in D can be continuously
shrunk to any point in D without leaving D.
For example, the interior of a sphere or a cube. the interior of a sphere with finitely
many points removed. and the domain between two concentric spheres are simply
IJOHANN FRIEDRICH PFAFF (1765-1825), German mathematician.
CHAP. 10
430
Vector Integral Calculus. Integral Theorems
connected. while the interior of a torus (a doughnut; see Fig. 247 in Sec. 10.6) and the
interior of a cube with one space diagonal removed are not simply connected.
The criterion for exactness (and path independence by Theorem 3*) is now as follows.
THEOREM 3
Criterion for Exactness and Path Independence
Let F b F2' F3 in the line integral (I),
fc F(r)"dr = fc
(PI dx
+
F2 d. . .
+
F3 d::.),
be contillllOllS and have cominuous first partial derivatives ill a domain D in space.
Then:
(a) lfthe differel1Tialform (4) is eX({Ci ill D-al1d thus (I) is path independent
by Theorem 3*-, then in D,
(6)
curl F
= 0;
ill components (see Sec. 9.9)
(6')
(b) If (6) holds in D and D is simply connected. thell (4) is exact in D-and
thus (I) is path independent by Theorem :1*.
PROOF
(a) If (4) is exact in D, then F = grad f in D by Theorem 3*, and, furthermore,
curl F = curl (grad.f) = 0 by (2) in Sec. 9.9, so that (6) holds.
(b) The proof needs "Stokes's theorem"" and will he given in Sec. 10.9.
•
f F( r) "dr = f
(F I dx + F2 dy) the curl has only one
c
c
component (the z-component), so that (6') reduces to the single relation
Line Integral in the Plane.
For
(6")
(which also occurs in (5) of Sec. 1.4 on exact ODEs).
E X AMP L E 3
Exactness and Independence of Path. Determination of a Potential
Using (6'), show that the differential form under the integral sign of
is exact, so that we have independence of path in any domain, and find the value of I from A: (0, 0, 1)
to B: (l, 7r/4, 2).
SEC. 10.2
431
Path Independence of Line Integrals
Solution.
Exactness follows from (6'), which gives
2
(F3 )y = 2x z
+
=
(Fl)z = 4xyz
(F2 )x
To find
and F 3 ,
J,
=
2xz
2
cosyz - yzsinyz = (F2 )z
(F3 )x
(F1)y'
=
we integrate F2 (which is "long," so that we save work) and then differentiate to compare with Fl
fz
=
2
2x zy
+ Y cos yz +
=
h'
2
2x zy
=
F3
+ Y cos yz,
h' = O.
h' = 0 implies h = const and we can take h = 0, so that g = 0 in the first line. This gives, by (3),
f(x, y, z) = x 2YZ 2 -t sin yz,
7T
7T
4 . 4 + sin 2 -
feB) - f(A) = 1 .
0
=
7T
-t
•
1.
The assumption in Theorem 3 that D is simply connected is essential and cannot be omitted.
Perhaps the simplest example to see this is the following.
E X AMP L E 4
On the Assumption of Simple Connectedness in Theorem 3
Let
x
Y
F] = - - 2 - - 2 '
X
+Y
(7)
F2 = - 2 - - 2 '
x +Y
F3 =
o.
Differentiation shows that (6') is satisfied in any domain of the xy-plane not containing the origin. for example,
in the domain D: ~ <
~ + ; < ~ shown in Fig. 224. Indeed, Fl and F2 do not depend on z, and F3
so that the first two relations in (6') are trivially true. and the third is verified by differentiation:
V
aF2
ax
x 2 +y2-x·2x
y
+
(x2
(x2
x2
aFl
ay
y2)2
+
y2 - y-2y
(x 2 + V2 )2
2
=
0,
- x2
+ y2)2
y
2
(x2
- x2
+ y2)2
Clearly, D in Fig. 224 is not simply connected. If the integral
1=
f
c
(Fl dx
+
F2 dy)
=
f
C
-ydx + xdy
2
x + y2
were independent of path in D, then I = 0 on any closed curve in D, for example, on the circle x 2
But setting x = r cos 8, y = r sin e and noting that the circle is represented by r = I, we have
x=cose.
so that -y dx
+ x dy
=
2
sin e de
dx = -sin ed8,
+ cos2 8 de
= de
1=
f
y=sine.
+
y2 = 1.
dy = cos ede,
and counterclockwise integration gives
2.,,-
o
de =
I
27T.
Since D is not simply connected. we cannot apply Theorem 3 and cannot conclude that I is independent of path
in D.
Although F = grad f, where f = arctan (ylx) (verify!), we cannot apply Theorem I either because the polar
angle f = 8 = arctan (y Ix) is not single-valued, as it is required for a function in calculus.
•
CHAP. 10
432
Vector Integral Calculus. Integral Theorems
y
3
;1
Example 4
Fig. 224.
....
,,-- -.-. ....- .
-
11-81
PATH-INDEPENDENT INTEGRALS
Show that the fonn under the integral sign is exact in the
plane (Probs. 1-4) or in space (Probs. 5-8) and evaluate
the integral. (Show the details of your work.)
1.
f
f
(4."./8)
(y
+x
cosxJ dx
(c) Integrate from (0, 0) along the straight-line
segment to (c, I), 0 ~ c ~ I, and then horizontally to
(1, I). For c = I, do you get the same value as for
b = I in (b)? For which c is I maximum? What is its
maximum value?
cosxy dy)
y
(0.0>
2.
x
(c,l)
1
(0,5)
2
(y e
2x dx
ye 2x dy)
+
(5.0)
3.
fO,l)
e- X2 _ y2 (x dx
+
(1, b)
y dy)
(-1,-1)
4.
f
<0.0)
(6.w)
(cos 2 Y dx - 2x cos y sin y dy)
(2,0)
5.
f
f
f
f
111-191
(O,1,2)
(z e Xz dx
+
dy
+
xe
xz
dz)
+
y dy -
d;:.)
(0.0.0)
7.
(7.8.0)
(2xy dx
+
x 2 dy
+
sinh z £Iz)
0.0.0)
8.
[2X(y3 - Z3) dx
+
3x 2 )'2 dy - 3x 2Z 2 dz]
9. Show thar in Example 4 of the text we have
F = grad (arctan (ylx». Give examples of domains in
which the integral is path independent.
10. PROJECT. Path Dependence. (a) Show that
1
(x\ dx
c
xy-plane.
=
+
2xy2 dy) is path dependent in the
(b) Integrate from (0. 0) along the straight-line
segment to (1. b). 0 ~ b ~ I, and then vertically up to
(I, I); see the figure. For which of these paths is I
maximum? What is its maximum value?
t'
dz)
12. (3x 2e 2Y
+ x) dx + 2x 3e 2Y dy
2
13. 3x y {lY + x 3 dy + Y d:
14. 2x sin J dx + x 2 cosy dy + y2 dz
15. (ze
e Y ) dx - xe Y dy + eX dz
16. eX cos 2.1' dx - 2e x sin 2y dy - xz dz
17. xy Z2 d.1. + !X 2Z2 dy + x 2)'z do;:.
18. yz cosh x dx + Z sinh x dy + J sinh x dz
19. Y dt' + (x - 2y) dy + 4x dz
X
(4.4,0>
(2.0,1>
I
CHECK FOR PATH INDEPENDENCE
11. (cosh x::)(:: dx +
O .1 •0 )
ex2+y2-2z (x dx
x
and, if independent, integrare from (0, 0, 0) to (a, b, c).
(2,3.0)
6.
1
Project 10. Path Dependence
-
20. WRITING PROJECT. Ideas on Path Independence.
Make a list of the main ideas on path independence
and dependence in this section. Then work this list into
an essay. including explanations of all definitions and
on the practical usefulness of the theorems, but no
proofs. Include illustrating examples of your own.
Explain what happens in Example 4 if you take the
domain 0 <
Vr
+ y2 < ~.
SEC. 10.3
10.3
Calculus Review: Double Integrals.
Optional
433
Calculus Review: Double Integrals.
Optional
Students familiar with double integrals from calculus should go on to the next
section, skipping the present review, which is included to make the book reasonably
self-contained.
In a definite integral (1), Sec. 10.1, we integrate a function f(x) over an interval (a
segment) of the x-axis. In a double integral we integrate a function f(x, y), called the
integrand, over a closed bounded region2 R in the xy-plane, whose boundary curve has a
unique tangent at each point, but may perhaps have finitely many cusps (such as the
vertices of a triangle or rectangle).
The definition of the double integral is quite similar to that of the definite integral.
We subdivide the region R by drawing parallels to the x- and y-axes (Fig. 225). We
number the rectangles that are entirely within R from 1 to n. In each such rectangle we
choose a point, say, (Xk, Yk) in the kth rectangle, whose area we denote by LlA k. Then
we form the sum
n
in =
2:
f(xk, Yk) LlAk-
k~l
This we do for larger and larger positive integers II in a completely independent manner,
but so that the length of the maximum diagonal of the rectangles approaches zero as n
approaches infinity. In this fashion we obtain a sequence of real numbers i n" i n2 , . . . .
Assuming that f(x, y) is continuous in Rand R is bounded by finitely many smooth
curves (see Sec. 10.1), one can show (see Ref. [GR4] in App. 1) that this sequence
converges and its limit is independent of the choice of subdivisions and corresponding
points (xk, Yk). This limit is called the double integral of f(x, y) over the region R, and
is denoted by
f ff(x, y) dxdy
R
or
f ff(x, y) dA.
R
y
x
Fig. 225.
Subdivision of a region R
2 A region R is a domain (Sec. 9.6) plus, perhaps, some or all of its boundary points. R is closed if its boundary
(all its boundary points) are regarded as helonging to R; and R is bounded if it can be enclosed in a circle of
sufficiently large radius. A boundary point P of R is a point (of R or not) such that every disk with center P
contains points of R and also points not of R.
434
CHAP. 10
Vector Integral Calculus. Integral Theorems
Double integrals have properties quite similar to those of definite integrals. Indeed, for
any functions f and g of (x, y), defined and continuous in a region R,
f
f f kf dx dy = k f f
R
(1)
ff(f
+
(k constant)
dx dy
R
g)dxdy
= fffdxdy+ ffgdxdy
R
R
f ffdxdy
R
f ffdxdy
=
R
+ f ffdxdy
Rl
(Fig. 226).
R2
Furthermore, if R is simply connected (see Sec. 10.2), then there exists at lea')t one point
(xo, Yo) in R such that we have
(2)
f f f(x, y) dx dy
= f(xo, yo)A·
R
where A is the area of R. This is called the mean value theorem for double integrals.
Fig.
n6. Formula (l)
Evaluation of Double Integrals
by Two Successive Integrations
Double integrals over a region R may be evaluated by two successive integrations. We
may integrate first over y and then over x. Then the formula is
(3)
R
hex)
b [
f f f(x, y) dx dy = f
f
a
]
f(x, y) dy
dx
(Fig. 227).
g(x)
Here y = g(x) and y = hex) represent the boundary curve of R (see Fig. 227) and, keeping
x constant, we integrate f(x, y) over y from g(x) to hex). The result is a function of x. and
we integrate it from x = a to x = b (Fig. 227).
Similarly, for integrating first over x and then over y the formula is
(4)
f ff(x,y)dxdy
R
d[
= f
f
C
q(y)
p(y)
f(x,y)dx
J
dy
(Fig. 228).
SEC. 10.3
435
Optional
Calculus Review: Double Integrals.
y
y
d
h(X)~J
C
I
:
I
I
R
:
I
I
,---:
I
a
Fig. 227.
<
g(x) I
b
---------7)
P(y)~
R
----c~
/
'--"y)
x
x
Evaluation of a double integral
Fig. 228.
Evaluation of a double integral
The boundary curve of R is now represented by x = p(y) and x = q(y). Treating y as a
constant, we first integrate f(x. y) over x from p(y) to q(y) (see Fig. 228) and then the
resulting function of y from y = c to y = d.
In (3) we assumed that R can be given by inequalities a ~ x ~ b and g(x) ~ y ~ hex).
Similarly in (4) by c ~ y ~ d and p(y) ~ x ~ q(y). If a region R has no such representation,
then in any practical case it will at least be possible to subdivide R into finitely many
portions each of which can be given by those inequalities. Then we integrate f(x, y) over
each portion and take the sum of the results. This will give the value of the integral of
f(x. y) over the entire region R.
Applications of Double Integrals
Double integrals have various physical and geometric applications. For instance. the area
A of a region R in the xy-plane is given by the double integral
A
=
II
dxdy.
R
The volume V beneath the surface
is (Fig. 229)
z=
f(x, y) (> 0) and above a region R in the xy-plane
V= f ff(x,y)dxdy
R
because the term f(Xk, Yk) ilAk in in at the beginning of this section represents the volume
of a rectangular box with base of area ilAk and altitude f(xk, Yk)'
z
x
'.
Fig. 229.
-
Double integral as volume
y
436
CHAP. 10
Vector Integral Calculus. Integral Theorems
As another application. let f(x, y) be the density (= mass per unit area) of a distribution
of mass in the \)·-plane. Then the total mass M in R is
M
II
=
f(x. y) dx dy:
R
the center of gravity of the mass in R has the coordinates X,
x
=
~
~
y=
and
I I xf(x, y) dx dy
R
y,
where
I I yf(x, y) dx dy;
R
the moments of inertia Ix and Iy of the mass in R about the x- and y-axes, respectively, are
Ix = I I y2f(x, y) dx dy,
II
Iy =
R
x 2 f(x. y) dx dy;
R
and the polar moment of inertia 10 about the origin of the mass in R is
10
=
+ Iy =
Ix
I I(x 2
+ y2)f(x, y) dx dy.
R
An example is given below.
Change of Variables
In
Double Integrals. Jacobian
Practical problems often require a change of the variables of integration in double integrals.
Recall from calculus that for a definite integral the formula for the change from x to u is
b
(5)
Ia
f(x) dx
=
I/3
0'
f(x(u»)
dx
duo
du
_0
Here we assume that x = x(u) is continuous and has a continuous derivative in some
interval a ~ II ~ f3 such that x(a) = G, x(f3) = b [or x(a) = b. x(f3) = G] and X(ll) varies
between G and b when u varies between a and f3.
The formula for a change of variables in double integrals from x, y to ll, U is
(6)
IRI
f(x, y) dx d.\' =
IR*I
f(x(u, u), y(u, u))
a(x, y)
I-a(u,-u)-
Idu du;
that is, the integrand is expressed in terms of u and u, and dx dv is replaced by du du times
the absolute value of the Jacobian 3
(7)
J=
ax
ax
B(x, y)
au
au
ax
ay
ax
ay
a(u. u)
ay
ay
all au
iJu
au
au
au
---
3 Named after the German mathematician CARL GUSTAV JACOB JACOBI (1804-1851), known for his
contributions to elliptic functions. partial differential equations, and mechanics.
SEC. 10.3
Calculus Review: Double Integrals.
Optional
437
Here we assume the fol1owing. The functions
x = .Y(u, u),
y
=
y(u, u)
effecting the change are continuous and have continuous partial derivatives in some region
R* in the uu-plane such that for every (u, u) in R* the corresponding point (.Y, y) lies in
R and, conversely, to every (x, y) in R there corresponds one and only one (u, v) in R*;
furthermore, the Jacobian J is either positive throughout R* or negative throughout R*.
For a proof. see Ref. [GR4] in App. 1.
E X AMP L E 1
Change of Variables in a Double Integral
Evaluate the following double integral over the square R in Fig. 230.
Solution. The shape of R suggest~ the transfonnation x
J = !(u - v). The Jacobian is
J =
R corresponds to the square 0
II
~
u
2
~
2, 0
a(x, y) =
a(ll, v)
~
v
~
I! !I
1
=
_1
2
=
v. Then x
Y=
ll, X -
= !lu
+ v),
-!.. .
2
2
2. Therefore,
f2f21
2
(x -t Y ) dx dy =
-
R
+ y
00
2
(u
2
1
2
+ v ) - du dv
8
= -
2
.
3
•
y
x
Fig. 230.
Region R in Example 1
Of pmticular practical interest are polar coordinates r and ti, which can be introduced
by setting x = r cos ti, y = r sin e. Then
J
=
a(x, y) =
a(r,
Icos e
-r sin
el
e
r cos
e
e)
sin
= r
and
(8)
JJf(x, y) dx dy JJfer cos e. r sin e) r dr de
=
R
R*
where R* is the region in the re-plane corresponding to R in the xy-plane.
CHAP. 10
438
EXAMPLE 2
Vector Integral Calculus. Integral Theorems
Double Integrals in Polar Coordinates. Center of Gravity. Moments of Inertia
Let f(x, y) == I be the mass density in the region in Fig. 231. Find the total mass, the center of gravity, and the
moment' of inertia lx, Iy, 10 ,
y~
Solution. We use the polar coordinates just defined and formula (8), This gives the total mass
x
1
Fig. 231.
The center of gravity has the coordinares
Example 2
4
x = 71"
I "2
f
7
0
4
y=-
I
0
1.,,/2 1 cos fI dfl
4
r cos B r dr dB = 71"
-
3
0
4
=
3..,-
=
0.4244
for reasons of symmetry.
371"
The moments of inertia are
Ix
=
II
If
."./2
y2 dx dy =
1
2
=
r2 sin B r dr dB
ROO
."./2
={
1T
1=y
16
Why are
I
."./2
~ sin 2 B dB
0
i(1-
cos28)dB=
i (-i -
0) = I: = 0.1963
IT
for reasons of symmetry,
10 = Ix + ly
= "8 = 0.3927.
•
x and y less than~?
This is the end of our review on double integrals. These integrals will be needed in this
chapter. beginning in the next section.
•
• w ...... _
n
...
w,oc::
,_IiI_
___
_._
1. (Mean value theorem) lllustrate (2) with an example.
12-91
DOUBLE INTEGRALS
Describe the region of integration and evaluate. (Show the
details.)
ff
1
2.
o
3.
(x
+ y)2 dy
10 Ix"(1 -
11
o
2xy) dy dx
y
cosh tx
+ y) dx
dy
0
6. As Prob. 5, order reversed
4
7.
1J
o
7T'/2
9.
o
x
2
)'
dy dx
I-x
sin y
eX cos
y dx dy
0
X
-x
e
x 2y
+
dy dx
y2
10. Integrate xye _
over the triangular region with
vertices (0. 0). O. I). (1. 2).
4. As Prob. 3, order reversed
5.
o
X2
dx
x
3
I-x"
ff
I f
2x
x
1
1
8.
111-131
VOLUME
Find the volume of the following regions in space.
11. The region beneath z = x 2 + y2 and above the square
with vertices (1, I), (-I, 1), (-1, -I), (1, -1)
12.
The tetrahedron cut from the first octant by the plane
~x + 2)' + z = 6. Check by vector methods.
13. The first octant section cut from the region inside the
+ Z2 = 1 by the planes y = z = 0, x = y.
cylinder
r
0,
SEC. lOA
439
Green's Theorem in the Plane
[14-161 CENTER OF GRAVITY
Find the center of gravity (x, y) of a mass of density
j(x, y) = I in the given region R.
14. R the semi disk x 2 + y2 ~ a 2 , y ~ 0
15.
engineer is likely to need. along with other profiles listed
in engineering handbooks).
18. R as in Prob. 16.
17. R as in Prob. 15.
y
19.
'~
h
2+----..
h~
x
b
x
y
20.
h+--.....
117-201 MOMENTS OF INERTIA
Find the moments of ineltia Ix, Iy , 10 of a mass of density
j(x, y) = 1 in the region R shown in the figures (which the
10.4
o
a
2
a
2
x
Green's Theorem in the Plane
Double integrals over a plane region may be transformed into line integrals over the
boundary of the region and conversely. This is of practical interest because it may simplify
the evaluation of an integral. It also helps in the theory whenever we want to switch from
one kind of integral to the other. The transformation can be done by the following theorem.
THEOREM 1
Green's Theorem in the Plane4
(Transformation between Double Integrals and Line Integrals)
Let R be a closed bounded region tsee Sec. 10.3) in the xy-plane whose boundary
C consists offinitely many smooth curves (see Sec. 10.1). Let FI(x, y) and F2(x, y)
befunctions that are continuous and have continuous partial derivatives aF I lay and
aF2/ax everywhere in some domain containing R. Then
(1)
II (
R
aF2
aFI )
1 WI dx + F2 dy).
- - - - - dx:dy =
ax
ay
c
r
Here we integrate along the entire boundary C of R il1 such a sense that R is
the left as we advance ill the direction of illlegratioll (see Fig. 232 on p. 440).
011
4GEORGE GREEN (1793-1841), English mathematician who was self-educated, started out as a baker, and
at his death was fellow of Caius College, Cambridge. His work concemed potential theory in connection with
electricity and magnetism, vibrations, waves, and elasticity theory. It remained almost unknown. even in England.
until after his death.
A "domain containing R" in the theorem guarantees that the assumptions about FI and F2 at boundary poims
of R are the same as at other poims of R.
CHAP. 10
440
Vector Integral Calculus. Integral Theorems
y
x
Fig. 232. Region R whose boundary C consists of two parts:
C1 is traversed counterclockwise, while C2 is traversed
clockwise in such a way that R is on the left for both curves
Setting F = [Fl' F 2 ] = Fli + F 2 j and using (1) in Sec. 9.9, we obtain (1) in vectorial
form,
I I (curl F)-k
(1')
dxdy
f F-dr.
=
R
C
The proof follows after the first example. For if> see Sec. 10.1.
E X AMP L E 1
Verification of Green's Theorem in the Plane
Green's theorem in the plane will be quite important in our further work. Before proving it. let us get used to
2
it by verifying it for Fl = -,,2 - ?y, F2 = 2\")' + 2x and C the circle x + .1'2 = l.
Solutioll.
In (1) on the left we get
II(
ilF2
~
R
-
iJFl)
~
dxdy =
II
.
[(2)'
+ 2) - (2y - 7)ldxdy
= 9
II
dxdy = 911
R
R
since the circular disk R has area 7r.
We now show that the line integral in (1) on the right gives the same value, 97r. We must orient C
counterclockwise, say. ret) = [cos t, sin tl. Then r' (t) = [-sin t, cos tl. and on C,
F2
=
2xy
+ 2x =
2 cos t sin t
+ 2 cos t.
Hence the line integral in (1) becomes, verifYing Green's theorem,
~
c
f
f (2.,,-
(FiX'
+
F 2 y') dt =
[(sin 2 t - 7 sin t)( -sin t) + 2(cos t sin t + cos t)(cos
t)J dt
0
2.,,-
=
2
sin3 t + 7 sin2 t + 2 cos t sin t
o
=
PROOF
0
+ 2 cos 2 t) dt
•
+ 711 - 0 + 27r = 97r.
We prove Green's theorem in the plane. first for a special region R that can be represented
in both forms
a
~
x
~
b,
~
y
~
vex)
(Fig. 233)
p(y) ~ x ~ q(y)
(Fig. 234).
U(x)
and
c~)'~d,
SEC. 10.4
441
Green's Theorem in the Plane
y
u(x)
a
Fig. 233.
x
b
x
Example of a special region
Fig. 234.
Example of a special region
Using (3) in the last section, we obtain for the second term on the left side of (1) taken without
the minus sign
JJ-~ dx(~v = Jb[ J
aF
(2)
ay
R
aF
vex)
a
]
_1
u(x)
ay
dy
lix
(see Fig. 233).
(The first term will be considered later.) We integrate the inner integral:
JvCx)
aFt
IY=V(Xl
dy
ay
u(x)
=
=
FI(x. y)
FI[X, V(X)] - FI[x, u(x)].
y=u(xl
By inserting this into (2) we find (changing a direction of integration)
I J__
aF
1
ay
R
dl: dy
=
I
b
FI[x. vex)] dx -
a
= -
I
b
F1[x. u(x)] dx
a
f
a
J
b
F 1 [x, VeX)] lix -
F 1 [x, u(x)] lix.
a
b
Since y = vex) represents the curve C** (Fig. 233) and y = u(x) represents C*, the last
two integrals may be written as line integrals over C** and C* (oriented as in Fig. 233);
therefore,
(3)
JJ aF]ay dx dy = - f
R
FI(x. y) dx -
C**
= -~
c
f
FI(x. y) dx
C*
FI(x, y) dx.
This proves (I) in Green's theorem if F2 = O.
The result remains valid if C has portions parallel to the y-axis (such as C and Cin
Fig. 235). Indeed, the integrals over these portions are zero because in (3) on the right we
integrate with respect to x. Hence we may add these integrals to the integrals over C* and
C** to obtain the integral over the whole boundary C in (3).
We now treat the first term in (I) on the left in the same way. Instead of (3) in the last
section we use (4), and the second representation of the special region (see Fig. 234).
Then (again changing a direction of integration)
442
CHAP. 10
Vector Integral Calculus. Integral Theorems
aF
I I_2
ax
R
dx dy
Id[ Iq(Y)aFax
=I
=
_2
c
dx
]
dy
p(y)
d
+
F2(q(y), y) dy
I
c
F2(P(Y), y) dy
d
c
y
y
x
Fig. 235.
x
Proof of Green's theorem
Fig. 236.
Proof of Green's theorem
Together with (3) this gives (1) and proves Green's theorem for special regions.
We now prove the theorem for a region R that itself is not a special region but can be
subdivided into finitely many special regions (Fig. 236). In this case we apply the theorem
to each subregion and then add the results; the left-hand members add up to the integral
over R while the right-hand members add up to the line integral over C plus integrals over
the curves introduced for subdividing R. The simple key observation now is that each of
the latter integrals occurs twice. taken once in each direction. Hence they cancel each
other, leaving us with the line integral over C.
The proof thus far covers all regions that are of interest in practical problems. To prove
the theorem for a most general region R satisfying the conditions in the theorem, we must
approximate R by a region of the type just considered and then use a limiting process.
For details of this see Ref. [GR4] in App. 1.
•
Some Applications of Green's Theorem
E X AMP L E 2
Area of a plane Region as a Line Integral Over the Boundary
In (I) we first choose Fl = 0, F2 = x and then Fl = -y, F2 = O. This gives
II
R
dxdy = fXdY
and
C
respectively. The double integral is the area A of R. By addition we have
(4)
A
=
2.2 fc (xdy -
ydr;)
where we integrate as indicated in Green's theorem. This interesting formula expresses the area of R in terms
of a line integral over the boundary. It is used, for instance, in the theory of certain planimeters (mechanical
instruments for measuring area). See also Prob. 17.
SEC lOA
443
Green's Theorem in the Plane
For an ellipse x 2/0 2 + y2/b 2 = I or x = 0 cos t, Y = b sin t we get x' = - 0 sin t,),' = b cos t; thus from
(4) we obtain the familiar fonnula for the area of the region bounded by an ellipse,
2w
A
E X AMP L E 3
=
'
2I0f(x)'
2'71'"
,
- yx ) dt =
2"If[
0
ab cos2 t
Area of a Plane Region in Polar Coordinates
= r cos e. y = r sin e. Then
Let rand e be polar coordinates defined by x
dx = cos edr - rsin ede,
and
(4)
dy
=
sin edr
+
rCos ede.
becomes a formula that is well known from calculm.. namely,
A = -I
(5)
2
f'
c
r 2 de.
As an application of (5), we consider the cardioid r = a(l - cos
2
A
E X AMP L E 4
•
- (-ob sin2]
t) dt = Trab
= -a
2
f
e), where 0
~
e~
2Tr (Fig. 237). We find
2".
(l - cos e)2 de
= -3Tr
2
0
•
02
Transformation of a Double Integral of the Laplacian of a Function
into a Line Integral of Its Normal Derivative
The Laplacian plays an imponanl role in physics and engineering. A first impression of this was obtained in
Sec. 9.7, and we shall discuss this further in Chap. 12. At present, let us use Green's theorem for deriving a
basic integral formula involving the Laplacian.
We take a function w(x. y) that is conti nuous and has continuous first and second partial derivatives in a
domain of the xy·plane containing a region R of the type indicated in Green's theorem. We set FI = -aw/ay
and F2 = aw/ax. Then aFI/ay and aF2 /ax are continuous in R. and in (I) on the left we obtain
(6)
the Laplacian ofw (see Sec. 9.7). Furthermore. using those expressions for FI and F2 • we get in (I) on the right
(7)
f·WI
c
dx
+ F2 dy)
=
f' (d.x +
c
f' (
dl' )
F2 -d'
ds =
sSe
FI -d
-
aw
d.x
ay
ds
-;- -
dV)
+ -au' --"ax
ds
ds
where s is the arc length of C, and C is oriented as shown in Fig. 238. The integrand of the last integral may
be written as the dot product
(8)
(grad wJon =
aw] [dV
[ aw
ax
ay
ds
- , -;-
0
--"-,
-
-
d.x]
aw
dy
aw
d.x
ds
ax
ds
CJy
ds'
y
y
x
x
Fig. 237.
Cardioid
Fig. 238.
Example 4
CHAP. 10
444
Vector Integral Calculus. Integral Theorems
The vector n IS a untt normal vector to C, because the vector r' (s) = drlds = [dxlds, dvldsl is the unit
tangent vector of C. and r' • n = O. so that n is perpendicular to r'. Also, n is directed to the exterior of C
because in Fig. 238 the positive x-component dxlds of r' is the negative y-component of n. and similarly at
other points. From this and (4) in Sec. 9.7 we see that the left side of (8) is the derivative of u' in the direction
of the outward normal of C. This derivative is called the normal derivative of w and is denoted by awlall:
that is. au-Iall = (grad w)· n. Because of (6), (7), and (8). Green's theorem gives the desired formula relating
the Laplacian to the normal derivative.
II V
(9)
2
wdxdy
=
R
fc awan ds.
2
For instance. \I" = x - y2 sati~fies Laplace's equation ,2w = O. Hence its nomml derivative integrated over
a closed curve must give O. Can you verify this directly by integration. say. for the square 0 -<:: x -<:: 1.
0-<::)" -<:: I?
•
Green's theorem in the plane may facilitate the evaluation of integrals and can be used in
both directions, depending on the kind of integral that is simpler in a concrete case. This
is illustrated further in the problem set. Moreover, and perhaps more fundamentally,
Green's theorem will be the essential tool in the proof of a very important integral theorem,
namely, Stokes's theorem in Sec. 10.9.
• ; ITM'=SE 'F-:)-O
11-121
_]1-== ____
EVALUATION OF LINE INTEGRALS
BY GREEN'S THEOREM
Using Green's theorem, evaluate
f
F(r)-drcounterclockwise
c
around the boundary curve C of the region R, where
1. F = l~XV4, ~x~l. R the rectangle with vertice~ (0. 0).
(3, 0)' (3, 2), (0. 2)
2. F = [y sin x. 2x cos y]. R the square with vertices
(0, 0), (~7T. 0). ~7T, !7T). (0. ~7T)
3. F = [_y3. x 3], C the circle x 2 + )"2 = 25
4. F = [-eY •
eX]. R the triangle with vertices (0. 0).
(2, 0). (2. l)
5. F = [e x + y • eX 0. 1),0.2)
Y ].
R the triangle with vertices (0. 0).
6. F = [x cosh y.
x 2 sinh y]. R: x 2 ~ y ~ x. Sketch R.
7. F = [x 2
x 2 - )'2], R: 1 ~ Y ~ 2 - x 2. Sketch
+
y2.
113--161
INTEGRAL OF THE NORMAL DERIVATIVE
Using (9). evaluate 1, ~w ds counterclockwise over the
Jc [In
boundary curve C of the region R.
13. w = sinh x, R the triangle with vertices (0, 0), (2, 0),
(2, 1)
14. w = t 2 + )'2, C: x 2
direct integration.
15. w = 2 In (x 2 + y2)
16. w = x 6y
+
xy6,
+ .\'2
= l.
Confirm the answer by
+ xy3, R: 1 :;S )' ~ 5
R: x 2 + y2 ~ 4, " ~ 0
- x 2• X ~ 0
17. CAS EXPERIMENT. Apply (4) to figures of your
choice whose area can also be obtained by another
method and compare the results.
18. (Laplace's equation) Show that for a solution w(x, y)
of Laplace's equation \,2U- = 0 in a region R with
boundary curve C and outer unit normal vector n,
R.
8. F
x2
=
[eX cosy.
+ y2
-ex siny]. R the semidisk
~ a 2• x ~ 0
(10)
9. F = grad (x 3 cos 2 (xv)), R the region in Prob. 7
10. F = [x In y.
yeo,,]. R the rectangle with vertices (0. I).
(3. 1), (3. 2). (0. 2)
11. F = [2\" - 3y. x
+ 5y]. R:
16x2
+ 25."2 ~ 400. y ~ 0
12. F = [x~,2. -xly2]. R: I ~ x 2 + y2 ~ 4. x ~ 0,
y ~ x. Sketch R.
1,
=
dw
Jc w an
ds.
19. Show that w = 2e x cos)' satisfies Laplace's equation
V2 w = 0 and. using (0), integrate w(ilwldll)
counterclockwise around the boundary curve C of the
square 0 ~ x ~ 2, 0 ~ y ~ 2.
SEC. 10.5
445
Surfaces for Surface Integrals
20. PROJECT. Other Forms of Green's Theorem in
the Plane. Let Rand C be as in Green's theorem, r'
a unit tangent vector. and n the outer unit normal vector
of C (Fig. 238 in Example 4). Show that (1) may be
written
(11)
I I div F dx dy
R
10.5
=
f
or
(12)
I I(CUrIF)OkdXdy
R
=
f
For' ds
C
where k is a unit vector perpendicular to the xy-plane.
Velify (11) and (12) for F = [7x, - 3)'] and C the circle
x 2 + )'2 = 4 as well as for an example of your own
choice.
F n ds
0
C
Surfaces for Surface Integrals
Having introduced dquble integrals over regions in the plane, we turn next to surface
integrals, in which we integrate over surfaces in space, such as a sphere or a portion of a
cylinder. For this we must first see how to represent a surface. And we must discuss
surface normals, since they are also needed in surface integrals. For simplicity we shall
say "surface" also for a portion of a surface.
Representation of Surfaces
Representations of a surface S in xyz-space are
z
(1)
f(x. y)
=
g(x, y, z) =
or
o.
For example, z
+Va 2 - x 2 - y2 or x 2 + y2 + Z2 - a 2 = 0 (z ~ 0) represents a
hemisphere of radius a and center O.
Now for cun'es C in line integrals. it was more practical and gave greater flexibility to
use a parametric representation r = r(t). where a ~ t ~ b. This is a mapping of the interval
a ~ t ~ b, located on the t-axis, onto the curve C (actually a portion of it) in x)'z-space.
It maps every t in that interval onto the point of C with position vector ret). See Fig. 239A.
yr(t)
~ec
~
x
/d
in space
Z
I/
n-
y
/'
r(u,v)
SurfaceS
in space
v
-I
1
a
b
(t-axis)
u
(A) Curve
Fig. 239.
(E) Surface
Parametric representations of a curve and a surface
446
CHAP. 10
Vector Integral Calculus. Integral Theorems
Similarly, for surfaces S in surface integrals, it will often be more practical to use a
parametric representation. Surfaces are two-dimensional. Hence we need two parameters,
which we call u and v. Thus a parametric representation of a surface S in space is of
the form
r(u. v)
(2)
[x(u, v), y(u, v), .-:(u, v)] = x(u, v)i
+
y(u, v)j
+ :::(u, v)k
where (u, v) varies in some region R of the uv-plane. This mapping (2) maps every point
(u, v) in R onto the point of S with position vector r(u, v). See Fig. 239B.
E X AMP L E 1
Parametric Representation of a Cylinder
The circular cylinder x 2 + y2 = a 2 , - ) ;a Z "" ), has radius a, height 2, and the ~-axis as axis. A parametric
representation is
(Fig. 240).
r(u. v) = [a cos 1/, asinu, vl = acosui + asinuj + vk
The components of r are x = a cos u, y = a sin u, Z = v. The parameters u, v vary in the rectangle
R: 0 ;a u "" 2'IT, -) ;a v ;a I in the uv-plane. The curves u = COllst are vertical straight lines. The curves
v = COllst are parallel circles. The point P in Fig. 240 corresponds to 1/ = 'lT13 = 60°, v = 0.7.
•
z
, p
-------q--
(v = 1)
\.
\.
,P
(v=O)
~
x
(v =-1)
Fig. 240.
E X AMP L E 2
Fig. 241.
Parametric representation
of a cylinder
Parametric representation
of a sphere
Parametric Representation of a Sphere
A sphere x 2 +
y2
+ ~2
=
a 2 can be represented in the form
r(u, v) = a cos v cos
(3)
II
i
+ a cos v sin 11 j + a sin v k
where the parameters u, v vary in the rectangle R in the uv-plane given by the inequalities 0 ;a u "" 2'IT.
- 'IT!2 ;a v ;a 'lT12. The components of r are
x = a cos V cos u.
y = a cos v sin u.
Z =
a sin v.
The curves u = COllst and v = COllst are the "meridians" and "parallels" on S (see Fig. 241). This represel1lalion
is used ill geography for measurillg the latitude and longitude of points 011 the globe.
Another parametric representation of the sphere also used in mathematics is
(3*)
where 0 ""
r(u, V) =
1/ ""
2'IT, 0 ""
V ""
'IT.
a cos 1/ sin v i + a sin u sin v j + a cos V k
•
SEC. 10.5
447
Surfaces for Surface Integrals
EX AMP L E 3
Parametric Representation of a Cone
A circular cone z
=
Yx
2
+
r(u,
in components x
2
Check that x +
i, 0 ~ t ~ H can be represented by
V) = [u
cos
V, u
sin V, u1 = u cos V i
+ u sin V j + uk,
= it cos V, y = u sin v,::: = u. The parameters vary in the rectangle R: 0 ~ u ~
)'2 = Z2, as it should be. What are the curves u = const and V = COllst?
H. 0
~ V :0;
2n.
•
Tangent Plane and Surface Normal
Recall from Sec. 9.7 that the tangent vectors of all the curves on a surface S through a
point P of S form a plane, called the tangent plane of S at P (Fig. 242). Exceptions are
points where S has an edge or a cusp (like a cone), so that S cannot have a tangent plane
at such a point. Furthermore, a vector perpendicular to the tangent plane is called a normal
vector of S at P.
Now since S can be given by r = r(u, v) in (2), the new idea is that we get a curve C
on S by taking a pair of differentiable functions
u
= u(t),
v
= v(t)
whose derivatives u' = dll/dt and v' = dv/dt are continuous. Then C has the position
vector i(t) = r(u(t), vet)). By differentiation and the use of the chain rule (Sec. 9.6) we
obtain a tangent vector of C on S
di
I
i (t) = dl
ar
-u
iJu
I
ar
av
+ -v
I
Hence the partial derivatives ru and rv at P are tangential to Sat P. We assume that they
are linearly independent, which geometrically means that the curves u = canst and
v = canst on S intersect at P at a nonzero angle. Then r u and r v span the tangent plane
of S at P. Hence their cross product gives a normal vector N of Sat P.
(4)
The corresponding unit normal vector n of S at P is (Fig. 242)
(5)
n=
I
lNI N =
1
Iru
X
rvl
n
s
Fig. 242. Tangent plane and normal vector
CHAP. 10
448
Vector Integral Calculus. Integral Theorems
Also, if S is represented by g(x, y, z) = 0, then, by Theorem 2 in Sec. 9.7,
I
n= - - - gradg.
Igrad gl
(5*)
A surface S is called a smooth surface if its surface normal depends continuously on
the points of S.
S is called piecewise smooth if it consists of finitely many smooth portions.
For instance, a sphere is smooth, and the surface of a cube is piecewise smooth
(explain!). We can now summarize our discussion as follows.
Tangent Plane and Surface Normal
THEOREM 1
If a suiface S is given by (2) with continuous ru = Br/rJu LInd rv = Br/CJv satisfying
(4) at every point of S, then S has at every point P a unique tangent plane passing
through P and spanned by ru and r v ' and a unique normal whose direction depends
continuously on the points of S. A /lonnal vector is given by (4) and the
corresponding unit /lonnal vector by (5). (See Fig. 242.)
E X AMP L E 4
Unit Normal Vector of a Sphere
From (5*) we find that the sphere g(x.
y.
n(x. y. z) =
z) = x 2
[
X
)'
a
a
-, -
+
y2
z]
. -
a
We see that n has the direction of the position vector [x,
must be the case?
E X AMP L E 5
+
Z2 -
x
= -
a
a2
i
0 ha~ the unit normal vector
,
+ -y
a
j
z
+ -
a
k.
y,
z] of the corresponding point. Is it obvious that this
•
= 0
in Example 3, the unit nonnal vector n becomes
Unit Normal Vector of a Cone
At the apex of the cone g(x. y, z) = -z +
undetermined because from (5*) we get
Vf
+
i
We are now ready to discuss surface integrals and their applications. beginning in the next
section .
--... ..-. ...
11-lOl
PREPARATION FOR SURFACE INTEGRALS:
PARAMETRIC REPRESENTATION,
NORMAL
Familiarize yourself with parametric representations of
important surfaces by deriving a representation (1), by finding
the parameter curves (curves II = eonst and u = eonst) of
the surface and a nonmal vector N = r ll x rv of the surface.
(Show the details of your work.)
1. xy-plane r(ll, u) = [u,
Probs. 2-10)
uJ (thus ui + uj; similarly in
2. xy-plane in polar coordinates
r(u, u) = [u cos u, u sin u] (thus u
=
3. Elliptic cylinder r(u, u) = [a cos u,
b sin u,
4. Paraboloid of revolution
r(u, u) = [u cos u, u sin u,
5. Cone r(u,
U)
= [au cos u,
r, u
=
0)
u]
u2]
au sin u.
6. Hyperbolic paraboloid
r(u, u) = [4u cosh u. u sinh u,
u2
eu]
J
7. Elliptic paraboloid r(u, u) = [3u cos u,
4u sin u,
u2]
SEC 10.6
8.
449
Surface Integrals
Helicoid r(u, v) = [1/ cos v,
v J. Explain the
u sin v,
20. (Representation z = f(x,y)) Show that z = f(x, y) or
g = z - flx, )'J = 0 can be written (fu = ilf/ilu. etc.)
nanl.e.
9. Ellipsoid
r(u. v) = [2 cos v cos u.
10. Ellipsoid
r(Lt, v) = [a cos v cos Lt,
3
co~
v sin Lt.
b cos v sin Lt,
4 sin vI
(6)
r(Lt, v) = [Lt,
feLt. v)j
v,
N = gradg =
and
I-f,,, -fv, 1].
c sin vI
21. (Orthogonal parameters) Show that the parameter
11. CAS
EXPERIMENT.
Graphing
Surfaces,
Dependence on a, b, c. Graph the surfaces in Probs.
1-10. In Probs. 6--9 generalize the surfaces by
introducing parameters a. b. c and then find out in
Probs. 3-10 how the shape of the surfaces depends on
a. h. c.
curves II = const and v = COllst on a surface r(lI, v)
are orthogonal (intersect at right angles) if and only if
rU.-rv = o.
22. (Condition (4)) Find the points in Probs. 2-7 at which
(4) N
0 does not hold and state whether this is owing
to the shape of the surface or to the choice of the
representation.
23. (Change of representation) Represent the paraboloid
in Proh. 4 so that N(O, 0)
0, and show N.
24. PROJECT. Tangent Planes T(P) will be less
important in our work, bur you should know how to
represent them.
ru
rv) = 0
(a) If S: rILl, V), then T(P): (r* - r
(a scalar triple product) or
r*(p. q) = rIP) + pr,,(P) + qrvCP).
lb) If S: g(x, y, z) = 0, then T(P): (r* - rtp) - vg = O.
*
112-191
DERIVATION OF PARAMETRIC
REPRESENTATIONS
Find a parametric representation and a normal vector. (The
answer gives one of them. There are many.)
+ y - 3z = 30
Plane 4x - 2y + 10.;: = 16
Sphere (x - 1)2 + (y + 2)2 + Z2 =
Sphere Ix + 2)2 + y2 + (z - 2)2 =
Elliptic paraboloid.;: = 4x2 + \.2
Parabolic cylinder z = 3)'2
*
12. Plane 5x
13.
14.
15.
16.
17.
18. Hyperbolic cylinder 9x 2
19. Elliptic cone z =
-
25
I
(c) If S: z = f(x, y), then
T(P): z* - z = (x* - x)fx(P)
4.\'2 = 36
+
(y* - y)fY(P)).
Interpret (a)-(c) geometrically. Give two examples for
(a), two for (b), and two for (c).
V9x2 + y2
10.6 Surface Integrals
To define a surface integral, we take a surface 5, given by a parametric representation as
just discussed,
(1)
r(u, v)
=
Ix(u, v), y(u, v), z(u, v)]
=
x(u, v)i
+ y(u,
v)j
+ z(u, v)k
where (u. v) varies over a region R in the uv-plane. We assume 5 to be piecewise smooth
(Sec. 10.5). so that 5 has a normal vector
(2)
and unit normal vector
n=
1
TNfN
at every point (except perhaps for some edges or cusps, as for a cube or cone). For a given
vector function F we can now define the surface integral over 5 by
(3)
JJF-n dA JJFCrCu, v))-N(u, v) du dv.
=
S
R
450
CHAP. 10
Vector Integral Calculus. Integral Theorems
Here N = INln by (2), and INI = Iru x rvl is the area of the parallelogram with :-ides ru
and r by the definition of cross product. Hence
L
"
n dA = n INI dll du = N dll du.
(3*)
And we see that dA = INI du du is the element of area of S.
Also F-n is the normal component of F. This integral arises naturally in flow problems,
where it gives the flux across S (= mass of fluid crossing S per unit time; see Sec. 9.8)
when F = pv. Here. p is the density of the fluid and v the velocity vector of the flow
(example below). We may thus call the surface integral (3) the flux integral.
We can write (3) in components, using F = [Fb F2 , F3]' N = [Nb N2 , N3]' and
n = [cos a, cos f3, cos '}']. Here, a, f3, I' are the angles between n and the coordinate axes;
indeed, for the angle between nand i, formula (4) in Sec. 9.2 gives cos a = noj/lnllil = noi,
and so on. We thus obtain from (3)
II
(4)
II
= II
Fon dA =
s
+
(FI cos a
F2 cos
f3 + F3 cos '}') dA
s
(FIN]
+
F2N2
+
F3 N 3) du du.
R
f3 dA = dz. dx.
In (4) we can write cos adA = dy dz, cos
becomes the following integral for the flux:
(5)
II
Fon dA
=
s
II
(F] dy dz.
+
F2 d::. dx
cos '}'dA
= dx dy. Then (4)
+ F3 dx dy).
s
We can use this formula to evaluate surface integrals by converting them to double integrals
over regions in the coordinate planes of the xy::.-coordinate system. But we must carefull)
take into account the orientation of S (the choice of n). We explain this for the integrals
of the F3 -terms,
II
(5')
F3 cos '}'dA
=
s
II
F3 dx dy.
s
If the surface S is given by z = hex, y) with (x, y) varying in a region R in the xv-plane,
and if S is oriented so that cos I' > 0, then (5') gives
(5")
II
S
F3 cos '}'dA = +
II
F3(x. y. hex. y) dxdy.
R
But if cos I' < 0. the integral on the right of (5") gets a minus sign in front. This follows
if we note that the element of area dx d..v in the xy-plane is the projection Icos '}'I dA of
the element of area dA of S; and we have cos I' = + Icos '}'I when cos I' > 0, but
cos I' = -leos '}'I when cos I' < O. Similarly for the other two terms in (5). At the same
time, this justifies the notations in (5).
Other forms of surface integrals will be discussed later in this section.
SEC. 10.6
451
Surface Integrals
E X AMP L E 1
Flux Through a Surface
2
Compute the flux of water through the parabolic cylinder S: y = x , 0 "'" X "'" 2, 0 "'" ::: "'" 3 (Fig. 243) if the
velocity vector is v = F = [3,=2, 6, 6x:::], speed being measured in meters/sec. (Generally, F = pv, but water
3
3
has the density p = I gmlcm = I ton/m .)
Fig. 243.
Solution.
Writing x
= [/
and
z = v,
Surface 5 in Example 1
we have
y = x2 =
S:
[/2.
Hence a representation of S is
[u, r-?, v]
r =
(0 "'" [/ "'" 2, 0 "'" v "'" 3).
By differentiation and by the definition of the cross product,
N
= r-u x
rv
=
[1.
2u.
x [0. O.
0]
II
ff
f
3
=
F'n dA
s
0
=
2
(6uv
2
f
=
- 6) du dv
0
=
3
(3[/2V
2
(12v 2 - 12) dv
=
(4v 3
-
0].
= 6uv 2
- 6. By
12
- 6[/)
dv
U~O
0
3
-1.
[2u.
2
[3v . 6. 6uv]. Hence F(S)' N
=
On S, writing simply F(S) for F[r(u. v)], we have F(S)
integration we thus get from (3) the flux
I]
12v)
o
13
=
10~
- 36
=
72
[m3 /sec]
v~O
or 72 000 liters/sec. Note that the y-component of F is positive (equal to 6), so that in Fig. 243 the flow goes
from left to right.
Let us confirm this result by (5). Since
N
=
INln
=
INllcos a,
cos {3,
cos 'YI
= l2u, -I, 0] =
I,
[2x,
0]
we see that cos a > 0, cos (3 < 0, and cos 'Y ~ O. Hence the second term of (5) on the right gets a minus sign,
and the last term is absent. This gives. in agreement with the previous result.
IIF'
s
E X AMP L E 2
ff
34
n dA
=
00
ff
23
3,=2 dy dz -
6 dz ll\-
00
=
f
3
0
2
4(3z ) dz -
f
2
6· 3 dx
=
4' 3
3
6' 3 . 2
-
=
72. •
0
Surface Integral
Evaluate (3) when F
(Fig. 244).
[x 2, 0, 3y2] and S is the portion of the plane x
+
x
Fig. 244.
Portion of a plane in Example 2
y
+ z
1
In
the first octant
452
CHAP. 10
Vector Integral Calculus. Integral Theorems
Solution. Writing x = " and y = v. we have z = I - x - y = I - II - v. Hence we can represent the plane
x + y + Z = I in the form r(u. v) = [II, V. I - u - v]. We obtain the first-octdnt portion S of this plane by restricting
= II and v = v to the projection R of S in the xy-plane. R is the triangle bounded by the two coordinate axes and
the straight line x +}' = I, obtained from x + y + Z = I by setting z = O. Thus 0;;;; x;;;; I - y. 0;;;;)';;;; I.
By inspection or by differentiation.
x
N
Hence F(S)oN =
fl?
II
FondA
S
= ru
X
rv = [I.
o.
=
-11 x [0. I, -I]
[I. 1, II.
O. 3v2]o[l. I. I] = u 2 + 3v 2 . By (3).
=
JIc
1
u2 + 3v 2)dlldv
=
ff
I-v
2
(u + 3V2)dudv
ROO
L[+
1
=
(I - V)3
2
+ 3v (l -
V)] dv =
+
•
Orientation of Surfaces
From (3) or (4) we see that the value of the integral depends on the choice of the unit
normal vector D. (Instead of D we could choose -D.) We express this by saying that such
an integral is an integral over an oriented surface S, that is, over a surface S on which
we have chosen one of the two possible unit normal vectors in a continuous fashion. (For
a piecewise smooth surface. this needs some further discussion, which we give below.)
If we change the orientation of S, this means that we replace 0 with -D. Then each
component of 0 in (4) is multiplied by -I, so that we have
THEOREM 1
Change of Orientation in a Surface Integral
The replacement ofn by - 0 (hence ofN by -N) corresponds to the lI1u/tipli("(ltion
of the integral in (3) or (4) by -1.
How do we effect such a change of N in practice if S is given in the form (l)? The
simplest way is to interchange u and v, because then ru becomes rv and conversely, so
that N = ru X rv becomes rv X ru = -ru X rv = -N, as wanted. Let us illustrate this.
E X AMP L E 3
Change of Orientation in a Surface Integral
In Example I we now repre~ent S by
r
=
[v. v2 • 11],0;;;; V
N = ru x r" =
;;;;
2. 0 ;;;;
II ;;;;
[0,0, I] x [I, 2v, 0] = [-2v, 1,0].
2
For F = [3z • 6. 6x.::] we now get FCS) = [3u 2, 6. 6uv]. Hence FCS)
the old result times - I,
II
3. Then
0
N=
-6u 2v + 6 and integration gives
3 2 3
F(S)'Ndvdu =
ff
2
C-6u v + 6)dvd" =
ROO
f
2
(-1211 + 12)du = -72.
•
0
Orientation of Smooth Surfaces
A smooth surface S (see Sec. 10.5) is called orientable if the positive normal direction,
when given at an arbitrary point Po of S, can be continued in a unique and continuous
way to the entire surface. For smooth surfaces occurring in applications this is always
true.
SEC. 10.6
Surface Integrals
453
n
s
c
s
c
i'
I
L
(a) Smooth surface
(b) Piecewise smooth surface
Fig. 245.
Orientation of a surface
Orientation of Piecewise Smooth Surfaces
Here the following idea will do it. For a smooth orientable smface S with boundary curve
C we may associate with each of the two possible orientations of S an orientation of C,
as shown in Fig. 245a. Then a piecewise smooth surface is called orientable if we can
orient each smooth piece of S so that along each curve C* which is a common boundary
of two pieces Sl and S2 the positive direction of C* relative to Sl is opposite to the direction
of C* relative to S2' See Fig. 245b for two adjacent pieces; note the arrows along C*.
Theory: Nonorientable Surfaces
A sufficiently small piece of a smooth swface is always orientable. This may not hold for
entire surfaces. A well-known example is the Mobius strip 5, shown in Fig. 246. To make
a model, take the rectangular paper in Fig. 246. make a half-twist, and join the short sides
together so that A goes onto A, and B onto B. At Po take a normal vector pointing, say.
to the left. Displace it along C to the right (in the lower part of the figure) around the strip
until you return to Po and see that you get a normal vector pointing to the right, opposite
to the given one. See also Prob. 21.
B
I
A
A
c
Po
1
B
Fig. 246.
Mobius strip
5AUGUST FERDINAND MOBIUS (1790-1868). German mathemallcian, srudeI1l of Gauss, known for his
work in surface theory, geometry, and complex analysis (see Sec. 17.2).
454
CHAP. 10
Vector Integral Calculus. Integral Theorems
Surface Integrals Without Regard to Orientation
Another type of surface integral is
f f G(r) dA = f f G(r(u, v))IN(u, v)1 du £Iv.
(6)
S
R
Here dA = INI du dv = Iru x rei du dv is the element of area of the surface S represented
by (1) and we disregard the orientation.
We shall need later (in Sec. 10.9) the mean value theorem for surface integrals, which
state~ that if R in (6) is simply connected (see Sec. 10.2) and G(r) is continuous in a
domain containing R, then there is a point (uo, vo) in R such that
ff
(7)
G(r) dA = G(r(uo, vo»)A
(A
= Area of S).
S
G
As for applications, if G(r) is the mass density of S, then (6) is the total ma'>s of S. If
= 1, then (6) gives the area A(S) of S,
(8)
= f fdA = f fir" x rvl du dv.
A(S)
S
R
Examples 4 and 5 show how to apply (8) to a sphere and a torus. The final example,
Example 6. explains how to calculate moments of inertia for a surface.
E X AMP L E 4
Area of a Sphere
For a sphere r(lI. v) = [a cos v cos II, a cos v sin II,
in Sec. 10.51 we obtain by direct calculation (verify!)
a
a
Using cus2
l/
2
sin v),
cos
2
0
~ u ~
.
27T,
{l2
V Sin Ii.
-7T/2 ~ v ~ 7T/2,
[see (3)
cos V sin uJ.
+ sin2 l/ = I and then cos2 v + sm2 V = 1. we obtain
With this, (8) gives the familiar formula (note that leos vi = cos v when -7T/2 ~ v ~ 7T/2)
f f
11"12
A(S)
= a2
-r./2
E X AMP L E 5
211"
Icos vi dll dv
= 27Ta 2
0
f
11"12
cos v dv
= 47Ta 2.
-.".12
•
Torus Surface (Doughnut Surface): Representation and Area
A torus swface S is obtained by rotating a circle C about a straight line L in space so that C does not intersect
or touch L but its plane always passes through L. If L is the ~-axis and C has radius b and its center has distance
a (> b) from L, as in Fig. 247, then S can be represented by
r(lI. v) = (a
where 0 :;:: u :;:: 27T. 0 :;::
V ~
fu
(a
+ b cos v) sin uj + b sin v k
27T. Thus
= -(a
fu =
ru X ru
+ b cos v) cos II i +
+ bcosv)sinui + (a + hcosv)cosl/j
-bSinVCOSlli - bsinvsinllj
~ b(lI
+
bcosV)(CO~llcosvi
+ bcosvk
+ sin llCOS vj + sinvk).
SEC. 10.6
Surface Integrals
455
Hence Iru x [vi
=
b(a
+ b cos v). and (8) gives
JJ
27T
(9)
A(S)
=
o
the total area of the torus.
27T
b(a
0
•
+ b cos v) dll dv = 4~ab.
z
c
y
1
1
~b~
1
1
1
1
1
1
I:
1
1
1
4-ja~
1
1
1
1
1
1\1
1.1
1
1
I
y
x
Fig. 247.
E X AMP L E 6
Torus in Example 5
Moment of Inertia of a Surface
Find the moment of inertia / of a spherical lamina S: x 2
mass M about the .:-axis.
+
)'2
+
:2 =
a 2 of constant mass density and total
Solutioll.
If a mass is distributed over a surface S and fLeX, y, ;:) is the density of the mass (= mass per unit
area), then the moment of inertia I of the mass with respect to a given axis L is defined by the surface integral
(10)
/=
II fLD
2
dA
s
where D(x. y. z) is the distance of the point lx. y. ;:) from L. Since. in the present example. fL is constant and S
has the area A = -I-7Ta 2 , we have fL = MIA = MI(47Tc?).
For S we use the same representation as in Example 4. Then D2 = x 2 + y2 = a 2 cos2 v. Also, as in that example,
dA = a 2 cos v dll dv. This gives the following result. [Tn the integration, use cos 3 v = cos v (1 - sin2 v).l
/=
2
II
S
fLD dA =
M
~
47Ta
Representations z = f(x,y).
= y, r = [l/, v, f] gives
I"'/2
J27ra4cos3vdlldv =
-",/2 0
M~'
2I"'/2 cos3 vdv
--../2
?Ma 2
=
3
If a surface S is given by z = f(x, y), then setting l/
V
INI = Iru x rvl = 1[I,O.f,,] x [0, 1,fv]1 = I[-fw -fv, 1]1
and, since fu
(11)
= fx, fv = fy, formula (6) becomes
IIc(r)dA= I IC(X,Y,f(X,y)Jl
S
R*
+
(ilf)2
ax
+ ( -of )2 dxdy.
ay
•
= x,
CHAP. 10
456
Vector Integral Calculus. Integral Theorems
R*
y
Fig. 248.
Formula (11)
Here R* is the projection of S into the _\y-plane (Fig. 248) and the normal vector N on S
points up. If it points down, the integral on the right is preceded by a minus sign.
From (11) with G = 1 we obtain for the area A(S) of S: z = j(x, y) the formula
(12)
A(S)
~
UJI
+
(:~)' + (~;.)' dxdJ
where K:' is the projection of S into the xy-plane, as before.
11-121
FLUX INTEGRALS (3)
f
Fon dA
5
Evaluate these integrals for the following data. Indicate the
kind of surface. (Show the details of your work.)
1. F = [2x,
5-",
s: r
0].
4u
= [u,
v,
~
0, z
+
3v].
0~1I~1,-8~v~8
2. F =
S: x
[x 2, y2,
+y +Z=
3. F
[x - z.
=
s: r
,:2].
~
4, x
y -
= [u cos V,
0, y
x. Z - y].
1/ sin v, u], 0
~
1]4-20 1
0
~ II ~
3. 0
~
v
~ 71"
4. F = leY, -e Z , eX],
s: x 2 + .1'2 = 9, x ~ O. -" ;;:; O. 0 ~ z ~ 2
5. F
=
y,
[x,
z], S: r
o ~ II ~ 4,
= [1/ cos v,
7. F
=
[1,
8. F
=
[tanxy.
10. F
S:
1,
= [y2,
z=
sin v.
2
11 ],
x 2y.
-z], S: y2
= I, 1 ~ x ~ 4
0],
+
x 2,
Z2 =
a 2, x ~ 0,
y ;;:;
0,
Z ~
0
Z4],
4Y~ + )2, 0 ~ z ~ 8, Y ~ 0
x 3,
11. F = [y3,
S: x 2 + 4)'2
12. F = [coshy,
=
Z3],
4, x ~ 0, y ~ O. 0 ~ z ~ h
0,
sinh x],
~z=x+~O~y~~O~x~1
G(r) dA
Evaluate these integrals for the following data. Indicate the
kind of surface. (Show the details.)
+
sinx,
~x+y+z=~x~~y;;:;~z~O
15. G = 5(x
S: z = x
+ )' + z),
+ 2y, 0
~
y
~
x, 0
~
x
~
2
+ e',
16. G = .vex +
S: x 2 + y2 = 16, Y ~ 0, 0 ~
0
+ ~::2
IJ
xeY
1], S the sphere of radius 1 and center 0
x,
y2
Z ~
SURFACE INTEGRALS (6)
14. G = cosy
-71" ~ V ~ 71"
6. F = [cosh yz. O. )'4].
S: -,,2 + Z2 = 1. 0 ~ x ~ 20.
9. F = [0,
s: x 2 +
II
13. CAS EXPERIMENT. Write a program for evaluating
sUiface integrals (3) that prints intennediate results
(F, F ° N, the integral over one of the two variables).
Can you experimentally obtain rules on functions and
surfaces giving integrals that can be evaluated by the
usual methods of calculus? Make a list of positive and
negative results.
17. G = (x
2
+ )'2 +
z2f,
S:
Z ~ 4
Z = Vx 2 + y2, y
~ 0,
0~z~2
18. G = ax + by + cz, S: x 2 + y2 + Z2 = I, y ~ O. z ~ 0
19. G = arctan (v/x),
S: z = x 2 + ),2, l ~ z ~ 9, x ~ 0, y ~ 0
20. G
=
3x),. S: z = xy. 0
~
x
~
1. 0
~
y
~
1
21. (Fun with Mobius) Make Mobius strips from long
slim rectangles R of grid paper (graph paper) by pasting
the short sides together after giving the paper a halftwist. In each case count the number of parts obtained
SEC. 10.6
Surface Integrals
457
by cutting along lines parallel to the edge. (a) Make R
three squares wide and cut until you reach the
beginning. (b) Make R four squares wide. Begin cutting
one square away from the edge until you reach the
beginning. Then cut the portion that is still two squares
wide. (c) Make R five squares wide and cut similarly.
(d) Make R six squares wide and cut. Formulate a
conjecture about the number of parts obtained.
(13)
ds 2 = E du 2
+
2F du dv
+
G dv 2
with coefficients
22. (Center of gravity) Justity the following formulas for
is called the first fundamental form of S. (E, F, G are
standard notations that have nothing to do with F and
G that occur at some other places in this chapter.) The
first fundamental form is basic in the theory of surfaces,
since with its help we can determine lengths, angles,
and areas on S. To show this, prove the following.
the mass M and the center of gravity (x, y, Z) of a lamina
S of density (mass per unit area) u(x, y, z) in space:
(a) For a curve C: u = u(t), v = vet), a ~ t ~ b, on
S, formulas (10), Sec. 9.5, and (14) give the length
APPLICATIONS
II
:v = ~ II
M=
x=
udA.
s
~ If.rudA ,
s
Z=
yudA,
~
s
II
I =
=
for the moments of inertia of the lamina in Prob. 22
about the x-, y-. and ::;-axes. respectively:
v'r'(t).r'(t) dt
b
YEu'2
Iz =
II
(x
2
+
y2)udA.
24. Find a fonnula for the moment of inertia of the lamina
in Prob. 22 about the line y = x, ::; = O.
Find the moment of inertia of a lamina S of density 1
about an axis A, where
x2 + y2
=
I,
26. S as in Prob. 25.
2
27. S: x +
y2 =
Z2,
0 ~
+
Gv'2dt.
cos 'Y =
s
s
25. S:
2Fu'v'
(b) The angle 'Y between two intersecting curves
C r : u = gUY, v = h(t) and C 2 : u = p(t), v = q(t) on
S: r(u, v) is obtained from
(16)
s
+
a
s
23. (Moments of inertia) Justify the following formulas
b
a
(15)
zudA.
I
I
where a = rug' + rvh' and b
tangent vectors of C r and C 2 .
(c) The square of the length of the normal vector N
can be written
so that formula (8) for the area A(S) of S becomes
z ~ h,
A: the z-axis
A: the line::: = h/2 in the xc-plane
0 ~ Z ~ h,
A: the z-axis
28. (Steiner's theorem6 ) If IA is the moment of inertia of
a mass distribution of total mass M with respect to an
axis A through the center of gravity, show that its
moment of inertia IB with respect to an axis E, which
is parallel to A and has the distance k from it. is
= rup' + rvq' are
A(S)
=
II I I INI
II Y
dA
=
S
(18)
=
du dv
R
EG - F2 du dv.
R
(d) For polar coordinates u (= r) and v (= 8) defined
by x = u cos v, y = u sin v we have E = 1, F = O.
G = u 2 , so that
ds 2 = du 2
+
u 2 dv 2
=
dr 2
+
r2 d8 2.
29. Using Steiner's theorem, find the moment of inertia of
S in Prob. 26 about the x-axis.
Calculate from this and (18) the area of a disk of
radius a.
30. TEAM PROJECT. First Fundamental Form of a
Surface. Given a surface S: r(lI, v), the corresponding
quadratic differential fonn
(e) Find the tirst fundamental fOlm of the torus in
Example 5. Use it to calculate the area A of the torus.
Show that A call also be obtained by the theorem of
6JACOB STEINER (1796-1863), Swiss geometer, born in a small village, learned to write only at age 14,
became a pupil of Pestalozzi at 18. later studied at Heidelberg and Berlin and, finally, because of his outstanding
research, was appointed professor at Berlin University.
458
CHAP. 10
Vector Integral Calculus. Integral Theorems
Calculate the first fundamental form for the usual
representations of important surfaces of your own
choice (cylinder, cone, etc.) and apply them to the
calculation of length~ and areas on these ~urfaces.
Pappos,7 which states that the area of a surface of
revolution equals the product of the length of a
meridian C and the length of the path of the center of
gravity of C when C is rotated through the angle In.
10.7
(I)
Triple Integrals.
Divergence Theorem of Gauss
In this section we discuss another "big" integral theorem, the divergence theorem, which
transforms surface integrals into triple integrals. So let us begin with a review of the latter.
A triple integral is an integral of a function f(x, y, z) taken over a closed bounded
(three-dimensional) region T in space (where "clo!o.ed" and "bounded" are defined as in
footnote 2 of Sec. 10.3, with "sphere" substituted for "circle"). We subdivide T by planes
parallel to the coordinate planes. Then we consider those boxes of the subdivision
(rectangular parallelepipeds) that lie entirely inside T, and number them from I to n. In
each such box we choose an arbitrary point, say, tXk, Yk, z,.J in box k. The volume of box
k we denote by Ll V k . We now form the sum
n
in
2:
=
f(xk, Yk, Zk) .1 Vk ·
k~l
This we do for larger and larger positive integers 11 arbitrarily but so that the maximum
length of all the edges of those 11 boxes approaches zero as II approaches infinity. This
gives a sequence of real numbers in}' Jn2 , . . . . We assume that f(x, Y, z) is continuous in
a domain containing T, and T is bounded by finitely many smooth sU/jaces (see Sec. 10.5).
Then it can be shown (see Ref. [GR4] in App. I) that the sequence converges to a limit
that is independent of the choice of subdivisions and corresponding points (Xk, Yk, Zk)' This
limit is called the triple integral of f(x, y, .;:) orer the region T and is denoted by
III
f(x, y, z) dx dy d.;:
or by
T
III
f(x, y, .;:) dV.
T
Triple integrals can be evaluated by three successive integrations. This is similar to the
evaluation of double integrals by two successive integrations, as discussed in Sec. LO.3.
An example is shown below (Example 1).
Divergence Theorem of Gauss
Triple integrals can be transformed into surface integrals over the boundary surface of a
region in space and conversely. Such a transformation is of practical interest because one
of the two kinds of integral is often simpler than the other. It also helps in establishing
fundamental equations in fluid flow, heat conduction, etc .. as we shaH see. The
transformation is done by the divergence theorem. which involves the divergence of a
vector function F = [FI, F 2 , F 3 ] = F1i + F 2 j + F3k, namely,
7PAPPUS OF ALEXANDRIA (about A.D. 300), Greek mathematician. The theorem is also called Guldin's
theorem. HABAKUK GULDIN (1577-1643) was born in St. Gallen, Switzerland. and later became professor
in Graz and Vienna.
SEC 10.7
459
Triple Integrals. Divergence Theorem of Gauss
(Sec. 9.8).
(1)
Divergence Theorem of Gauss
(Transformation Between Triple and Surface Integrals)
THEOREM 1
Let T be a closed bounded region in space whose bOl/ndary is a piecewise smooth
orielltable sll1face S. Let F(x, y, ;:) be a vector function that is cOlltinllous and has
continllolls first partial derivatives ill some domain containing T. Then
f f f div F dV = f f Fen dA.
(2)
s
T
= [Fl , F 2 , Fg] and of the (Juter unit 1101711al vector
In components ofF
n = [cos a, cos f3,
cos y] of S (as in Fig. 250), formula (2) becomes
f f f(
T
iJ~2 +
aFI +
ax
rJy
=
(2*)
fI
g
iJF
az
(FI cos a
)
+
eLl: d y d::.
F2 cos
f3 + Fg cos
y) dA
s
f f(F
=
1
dy dz + F2 dzdx + Fg dxdy).
s
The proof follows after Example 1. "Closed bounded region'" is explained above.
"piecewise smooth orientable" in Sec. I 0.5, and "domain containing T" in footnote 4,
Sec. 10.4, for the two-dimensional case.
EXAMPLE
Evaluation of a Surface Integral by the Divergence Theorem
Before we prove the theorem. let us show a typical application. Evaluate
z
b
I
I
I
ff
I=
3
2
(x dy dz + x 2y d: ell: + x : dt dy)
S
_----T----,
where S is the closed surface in Fig. 249 consisting of the cylinder x 2 +
disks::: = 0 and: = b (x 2 + y2 ~ a 2 ).
x
Solution.
- -....l.
Fig. 249. Surface 5
in Example 1
a (0 ~ ~ ~ b) and the circular
2
Hence div F = 3x2 + x2 + x 2 = 5x 2 . The form of the surface
suggests that we introduce polar coordinates r. edefined by x = r cos e. y = r sin e fthu~ cylindrical coordinates
e. :). Then the volume element is
=
and we obtain
F1
= x 3 • F2 = x 2 )".
)"2 =
r.
F3
=
.\"2:;:.
dl: dy dz rdr dl:J d:.
b
/=
fffSX2dXdYdZ= f
T
Z~O
f
27T
6~O
a
f (Sr cos e)rdrdl:Jdz
2
2
T~O
•
460
CHAP. 10
PROOF
Vector Integral Calculus. Integral Theorems
We prove the divergence theorem, beginning with the first equation in (2*). This equation
is true if and only if the integrals of each component on both sides are equal; that is,
(3)
IIIo.F1 dxdydz= IIF1 cos adA,
T
ox
s
(4)
III
T
(5)
III
T
O~2
f3 dA,
dx dy dz = I I F2 cos
s
0)
aF3
dxdydz = II F3cosydA.
s
7
o~
We first prove (5) for a special region T that is bounded by a piecewise smooth
orientable surface S and ha~ the property that any straight line parallel to anyone of the
coordinate axes and intersecting T has at most aile segment (or a single point) in common
with T. This implies that T can be represemed in the form
(6)
g(x, y)
~
z
~
/z(x, y)
where (x, y) varies in the orthogonal projection R of T in the xy-plane. Clearly,
R(X, y) represents the "bottom" S2 of S (Fig. 250), whereas z = hex, y) represents the
"top" Sl of S, and there may be a remaining vertical portion S3 of S. (The portion S3 may
degenerate into a curve, as for a sphere.)
To prove (5), we use (6). Since F is continuously differentiable in some domain
containing T, we have
z=
.
II1 --:-::- dxdydz
of
(7)
0<0
T
[
= I I
R
h<x,
y)
I
g(x, y)
"JF
~ dz
]
OZ
dxdy.
Integration of the inner integral [ ... ] gives F3 [x, y, hex, y)] - F 3 [x, y, glx, y)]. Hence the
triple integral in (7) equals
(8)
I I F3[x, y, hex, y)] dx dy - I I F3[x, y, g(x, y)] dx dy.
n'11
~/Sl
z
t
Y~~
,
1
fJ
n
1
1I~I1
x
Fig. 250.
11
11_ _
1
1
1
Example of a special region
Y
SEC. 10.7
Triple Integrals. Divergence Theorem of Gauss
461
But the same result is also obtained by evaluating the right side of (5); that is [see also
the last line of (2*)],
JJFgcOS ydA = JJFgdxdy
s
s
= +
JJ Fg[x, y, hex, y)] dx dy - JJ Fg[x, y. g(x, y)] dx dy,
where the first integral over R gets a plus sign because cos y> 0 on S1 in Fig. 250 [as
in (5"), Sec. 10.6], and the second integral gets a minus sign because cos y < 0 on S2'
This proves (5).
The relations (3) and (4) now follow by merely relabeling the variables and using the
fact that, by assumption, T has representations similar to (6). namely,
g(.\', z) ~ x ~
hey,
g(z, x) ~ y ~ h(z, x).
and
z)
This proves the first equation in (2*) for special regions. It implies (2) because the left side
of (2*) is just the definition of the divergence, and the right sides of (2) and of the first
equation in (2*) are equal, as was shown in the first line of (4) in the last section. Finally,
equality of the right sides of (2) and (2*), last line, is seen from (5) in the last section.
ThIS establishes the divergence theorem for special regions.
For any region T that can be subdivided into finitely many special regions by means of
auxiliary surfaces. the theorem follows by adding the result for each part separately; this
procedure is analogous to that in the proof of Green's theorem in Sec. LO.4. The sUlface
integrals over the auxiliary surfaces cancel in pairs, and the sum of the remaining surface
integrals is the surface integral over the whole boundary surface S of T; the triple integrab
over the parts of T add up to the triple integral over T.
The divergence theorem is now proved for any bounded region that is of interest in
practical problems. The extension to a most general region T of the type indicated in the
theorem would require a certain limit process: this is similar to the situation in the case
of Green's theorem in Sec. 10.4.
•
E X AMP L E 2
Verification of the Divergence Theorem
Evaluate
JJ(7xi - ;:k)"n
dA over the sphere S: x 2 +
y2
+
Z2
= 4
(a) by (2).
(b) directly.
s
3
= div [7x, 0, -z] = div [7xi - zk] = 7 - J = 6. Answer: 6' (4J3)1T' 2 = 641T.
(b) We can represent S by (3). Sec. 10.5 (with a = 2). and we shall use n dA = N dlt dv [see (3*), Sec. 10.61.
Accordingly,
Solution. (a) div F
S:
Then
r = [2 cos v cos
ru= l-2co$vsinll.
rv
N
Now on S we have x
= r ll
x r,.
=
2 cos v sin II.
2cosvcosu,
[-2sinvcoslI, - 2sinvsinu.
= [4 cos
2
v cos
II,
2
4 cos v sin II.
= 2 cos v cos II, Z = 2 sin v, so that F = [7x.
F(S)
and
It.
F(S)"N
= [14cosvco~lI.
O.
2 sin v].
0]
2 cos v]
4 cos v sin vJ.
O. -;oj becomes on S
-2 sin v]
=
(\4 cos v cos 1I)'4cos2 v cos II
=
56 cos3 v cos 2 u - 8 cos v sin 2 v.
+ (-2 sin v)'4 cos v sin v
462
CHAP. 10
Vector Integral Calculus. Integral Theorems
On S we have to integrate over
II
from 0 to 277". This gives
77"' 56 cos 3 v - 277"' 8 cos v sin2 v.
The integral of cos v sin2 v equal~ (sin3 v)/3, and that of cos 3 v = cos v (I - sin2 v) equals sin v - (sin 3 v)13.
On S we have -77"/2 ;'" v ;'" 77"12, so that by sub~tituting these limits we get
5677"(2 - 2/3) - ) 677"' 2/3 = 6477"
a~
•
hoped for. To see the point of Gauss's theorem, compare the amounts of work.
Coordinate Invariance of the Divergence. The divergence (I) is defined in terms of
coordinates, but we can use the divergence theorem to show that div F has a meaning
independent of coordinates.
For this purpose we first note that triple intgrals have properties l/uite similar to thuse
of double integrals in Sec. 10.3. In particular. the mean value theorem for triple integrals
asserts that for any continuous function f(x, y, .::) in a bounded and simply connected
region T there is a point Q: (xo, )'0, <'0) in T such that
III
(9)
f(x, y, z)
dV
= f(xo, Yo, '::0)V(T)
(V(l)
= volume of T).
T
In this formula we interchange the two sides, divide by veT), and set f = div F. Then by
(he divergence theorem we obtain for the divergence an integral over the boundary surface
SeT) of T,
(10)
div F(xo, Yo,
'::0)
= _1_
veT)
III div F dV = _1_ sm
I IFon dA.
veT)
T
We now choose a point P: (Xl> ,vI' ZI) in T and let T shrink down onto P so that the
maximum distance den of the points of T from P goes to zero. Then Q: (xo. )'0' 20) must
approach P. Hence (10) becomes
(11)
divF(P)=lim
d(T)->O
-I-IIFOUdA.
VeT)
SeT)
This proves
THEOREM 2
Invariance of the Divergence
The divergence of a vector function F with cOlltinuolls first partial derivatives in a
region T is independent of the particular choice of Cartesian coordinates. For any
Pin T it is given by (II).
Equation (l1) is sometimes used as a definition of the divergence. Then the representation
(1) in Cartesian coordinates can be derived from (11).
Further applications of the divergence theorem follow in the problem set and in the
next section. The examples in the next section will also shed further light on the nature
of the divergence.
SEC. 10.8
Further Applications of the Divergence Theorem
.------
C .. C_i, _
•••
APPLICATION OF TRIPLE INTEGRALS:
MASS DISTRIBUTION
11-81
14. The paraboloid
Find the total mass of a mass distribution of density u in a
region T in space. (Show the details of your work.)
1. u
2. u
Ixl
kl
x2y2~2, T the box
~ a, iYI ~ b,
~ c
2
= x + y2 + ~2, T the box 0 ~ x ~ 4, 0 ~ J ~ 9,
=
O~:::~l
=
5. u
6. u
= ~(X2
7. u
=
+
y2)2,
T the cylinder x 2 +
)'2 ~
4.
Izl ~ 2
= 30::;. T the region in the first octant bounded by
y = 1 - x 2 and z = x. Sketch it.
+ Y + ~2,
1
T the cylinder
,,2
+ ::.2
~ 9,
l~x~9
8. u = x 2
APPLICATION OF TRIPLE INTEGRALS:
MOMENT OF INERTIA
19-141
Ix
=
+ y2. T the ball x 2 + y2 + :;2 ~ a 2
JJJ
(y2
+ z2) dx dy dz of a mass of density
1 in
T
a region T about the x-axis. Find Ix when T is as follows.
9.
10.
11.
12.
13.
The cube 0
~
x
~
a, 0
~ y ~
a, 0
~
z
~
The box 0 ~ x ~ 1I, -bt2 ~)' ~ bt2. -el2
The cylinder y2 + :;2 ~ c?, 0 ~ x ~ Iz
+ .\'2 + ~2 ~ a 2
The cone y2 + ~2 ~ x 2 • 0 ~ x
The ball x 2
10.8
~
Iz
a
~ ~ ~
el2
y2
+ Z2
~ X, 0 ~ x ~ h
1 (h
15. Show that for a solid of revolution, f" = 27T L r\x) dx.
U~t: this to solve Probs. 11-14.
0
16. Why is Tx in Prob. 13 for large h larger than I,,, in Prob.
14? Why is it smaller for h = I? Give physical reason.
117-251
sin x cos y, T: 0 ~ x ~ ~7T. ~7T - X ~ Y ~ ~7T,
o ~ ~ ~ 12
4. u = e-"'-Y-z, Tthe tetrahedron with vertices (0. O. 0).
(2, O. a), (0, 2. a). (0, 0, 2)
3. u
463
APPLICATION OF THE DIVERGENCE
THEOREM:
F· n dA
SURFACE INTEGRALS
s
JJ
Evaluate this integral by the divergence theorem. (Show the
details.)
;::], S the sphere x 2
+ y2 + ::;2 = 9
18. F = [4x, 3z, 5y]. S the surface of the cone
x 2 + y2 ~ :;2. 0 ~ ~ ~ 2
19. F = [z - y y3, 2;::3]. S the surface of y2 + Z2 ~ 4,
-3 ~x~ 3
20. F = [3x)' 2, yx 2 - y3, 3zx2], S the surface of
2
t
+ y2 ~ 25. 0 ~ Z ~ 2
21. F = [sin y, cos x. cos ;::], S tile surface of
x 2 + )'2 ~ 4. Izl ~ 2
22. F = [x 3 - .\'3, )'3 - Z3, ;::3 - x3 ]. S the surface of
x 2 + y2 + ;::2 ~ 25, z :0=: 0
23. F = [4x 2 • 2x + y2. x 2 + Z2]. S the surface of the
tetrahedron in Prob. 4
24. F = [4x 2 • y2. -2 cos m:], S the surface of the
tetrahedron with vertices (0, O. a). (l. O. a). (0. l. m.
(0,0, 1)
25. F = [5x 3 . 5y 3, 5;::3], S: x 2 + y2 + Z2 = 4
17. F = [x,
y,
Further Applications of the
Divergence Theorem
We show in this section that the divergence theorem has basic applications in fluid flow,
where it helps characterize sources and sinks of fluid, in /zeat flo~r, where it leads to the
basic heat equation, and in potential theory, where it gives basic properties of the solutions
of Laplace's equation. Here the region T and its boundary surface S are assumed to be
such that the divergence theorem applies.
E X AMP L E 1
Fluid Flow. Physical Interpretation of the Divergence
From the divergence theorem wc may obtain an intuitive interpretation of the divergence of a vector. For this
purpose we consider the flow of an incompressible fluid (see Sec. 9.8) of constant density p = I which is steady.
that is. does not vary with time. Such a flow is determined by the field of its velocity vector yep} at any
poimP.
464
CHAP. 10
Vector Integral Calculus. Integral Theorems
Let S be the boundary surface of a region T in space, and let n be the outer unit normal vector of S. Then
von is the normal component of v in the direction of n, and Ivon dAI is the mass of fluid leaving T (if von> 0
at some P) or enterillg T (if von < 0 at P) per unit time at some point P of S through a small portion 6.S of S
of area 6.A. Hence the total mass of fluid that flows across S from T to the outside per unit time is given by the
surface integral
II
vondA.
s
Division by the volume Vof T give, the average flow out of T:
(1)
Since the flow is steady and the fluid is incompressible. the amount of fluid flowing outward must be continuously
supplied. Hence. if the value of the integral (I) is different from zero, there must be sources (positive sources
and negat;,'e sources. called sinks) in T. that is, points where fluid is produced or disappears.
If we let T shrink down to a fixed point P in T, we obtain from (I) the source intensity at P given by the
right side of (11) in the last section with F n replaced by von, that is,
0
(2)
div vlP)
=d(T)~O
lim
_1_
V(1)
IIvon
dA.
SeT)
Hence the dil'erge1lce of the "e/ocity ,'ector v of a steady incompressible floll' is the source intensit--.- of the flow
at the correJoponding point.
There are no sources in T if and only if div v is zero everywhere in T. Then for any closed surface S in T we
have
I IvondA
=
o.
s
E X AMP L E 2
•
Modeling of Heat Flow. Heat or Diffusion Equation
Physical experiments show that in a body, heat flows in the direction of decreasing temperature, and the rate of
flow is proportional to the gradient of the temperature. This means that the velocity v of the heat flow in a body
is of the form
(3)
v
= -Kgrad V
where V(x, y, z. t) is temperature, t is time. and K is called the thermal conductil'ity of the body: in ordinary
physical circumstances K is a constant. Using this information, set up the mathematical model of heat flow, the
so-called heat equation or diffusion equation.
Solution.
Let T be a region in the body bounded by a surface S with outer unit normal vector n such that
the divergence theorem applies. Then von is the component of v in the direction of n. and the amount of heat
leaving T per unit time is
IIvondA.
s
This expression is obtained similarly to the corresponding surface integral in the last example. Using
(the Laplacian; see (3) in Sec. 9.8), we have by the divergence theorem and (3)
I IvondA = -K I I IdiV(grad U)dxdyd::.
S
T
(4)
=
-K I
I
T
2
I V Vdxdyd::..
SEC. 10.8
465
Further Applications of the Divergence Theorem
On the other hand, the total amount of heat H in T is
H =
JJJ
apU d:'(dydz
T
where the constant u is the specific heat of the material of the body and p is the density
volume) of the material. Hence the time rate of decrease of H is
_ aH
at
and
thi~
= _
JJJu pau
T
(=
mass per unit
d.ydvdz
at'
must be equal to the above amount of heat leaving T. From (4) we thus have
or
JJJ(up:)~
- K\' 2 U ) dxdydz = O.
T
Since this holds for any region T in the body, the integrand (if continuous) must be zero everywhere; that is,
(5)
c2
K
=--
up
where c 2 is called the thermal diJfusil'ity of the material. This partial differential equation is called the heat
equation. It is the fundamental equation for heat conduction. And our derivation is another impressive
demonstration of the great importance of the divergence theorem. Methods for solving heat problems will be
shown in Chap. 12.
The heat equation is also called the diffusion equation because it also models diffusion processes of motIOns
of molecules tending to level off differences in den,ity or pressure in gases or liquids.
If heat flow does not depend on time, it is called steady-state heat flow. Then aUlat = 0, so that (5) reduces
to Laplace's equation 'iJ 2U = O. We met this equation in Secs. 9.7 and 9.8, and we shall now see thaI the
divergence theorem adds basic insights into the nature of solutions of this equation.
•
Potential Theory. Harmonic Functions
The theory of solutions of Laplace's equation
(6)
is called potential theory. A solution of (6) with continuous second-order partial
derivatives is called a harmonic function. That continuity is needed for application of
the divergence theorem in potential theory, where the theorem plays a key role that we
want to explore. Further details of potential theory follow in Chaps. 12 and 18.
E X AMP L E 3
A Basic Property of Solutions of Laplace's Equation
The integrands in the divergence theorem are div F and F' n (Sec. 10.7). If F is the gradient of a scalar function,
say. F = grad f, then div F = div tgrad f) = 'iJ2f ; see (3). Sec. 9.8. Also, F' n = n' F = n' grad f. TIris is
the directional derivative of f in the outer normal direction of S. the boundary surface of the region T in the
theorem. This derivative is called the (outer) normal derivative of f and is denoted by aflan. Thus the formula
in the divergence theorem becomes
CHAP. 10
466
Vector Integral Calculus. Integral Theorems
(7)
This is the three-dimensional analog of (9) in Sec. 10.4. Because of the assllmptions in the divergence theorem
this gives the following result.
•
THEOREM 1
r
I
X AMP L E 4
A Basic Property of Harmonic Functions
Let f(x, y, z) be a harmonic function in some domain D is space. Let S be any
piecewise smooth closed orientable st/1jace in D whose entire region it encloses
belongs to D. Then the integral of the nonna/ derivative of f taken over S is -;ero.
(For "piecewise smooth" see Sec. 10.5.)
Green's Theorems
Let f and g be scalar functions such that F = f grad g satisfies the assumptions of the divergence theorem in
some region T. Then
div F = div (f grad g)
=
iJg
iJg
iJg
div ([ f -;- . f -;- . f -;iJx
iJy
iJz
J)
Also, since f is a scalar function,
Fon = noF
=
no(fgradg)
=
(n grad g)f.
0
Now n° grad g is the direcl10nal derivative ag/iJll of g in the outer normal direction of S. Hence the formula in
the divergence theorem becomes "Green's first formula"
(8)
JJJ(fV2
T
g
+ grad f-grad g) dV
=
JJf an dA.
iJg
S
Formula (8) together with the assumptions is known as thefirstform of Greel1's theorem.
Interchanging f and g we obtain a similar formula. Subtracting this formula from (8) we find
(9)
This formula is called Green's second formula or (together with the assumptions) the secolldform ofGreell's
theorem.
•
SEC. 10.8
467
Further Applications of the Divergence Theorem
E X AMP L E 5
Uniqueness of Solutions of Laplace's Equation
Let I be harmonic in a domain D and let I be zero everywhere on a piecewise smooth closed orientable surface
S in D whose entire region T it encloses belongs to D. Then V 2 g is zero in T. and the surface integral in (8) is
zero, so that (8) with g = I gives
IIJ
grad I . grad I dV =
IJI
T
Igrad 112 dV = O.
T
Since I is harmonic, grad I and thus Igrad II are continuous in T and on S, and since Igrad II is nonnegative,
to make the integral over T zero. grad I must be the zero vector everywhere in T. Hence Ix = I y = I z = O.
and f is constant in T and, because of continuity, it is equal to its value 0 on S. This proves the following
theorem.
THEOREM 2
Harmonic Functions
Let j"(x, y, z) be harmonic in some dOll/ain D and zero at eVel)' point of a piecewise
smooth closed orientable suiface S in D whose entire region T it encloses belongs
to D. Then f is identically zero in T.
This theorem has an important conseq LIenee. Let II and 12 be functions that satisfy the assumptions of Theorem
I and take on the same values On S. Then their difference II - 12 satisfies those assumptions and has the value
o everywhere on S. Hence, Theorem 2 implies that
II -h=O
throughout
T,
and we have the following fundamental result.
THEOREM 3
Uniqueness Theorem for laplace's Equation
Let T be a region that satisfies the assumptions of the divergence theorem, and let
f(.\", y, z) be a hal11lOnic function in a domain D that contains T and its /JoundGl)'
surface S. Then f is uniquely detennined in T by its values on S.
The problem of determining a solution u of a partial differential equation in a region T such that u assumes
given values on the boundary surface S of Tis called the Dirichlet problem.8 We may thus reformulate Theorem
3 as follows.
THEOREM 3*
Uniqueness Theorem for the Dirichlet Problem
if the assumptions in Theorem 3 are satisfied and the Dirichlet problem for the
Laplace equation has a solution in T, then this solution is unique.
These theorems demonstrate the extreme importance of the divergence theorem in potential theory.
•
8PETER GUSTAV LEJEUNE DIRICHLET il805-1859), German mathematician, studied in Paris LInder
Cauchy and others and sLlcceeded Gauss at G6ttingen in 1855. He became known by his important research on
Fourier series (he knew Fourier personally) and in number theory.
468
CHAP. 10
Vector Integral Calculus. Integral Theorems
1. (Hannonic functions) Verify Theorem 1 for
f = 2x2 + 2y2 - 4z 2 and S the surface of the cube
o~ x
2.
3.
4.
5.
6.
I, 0 ~ y ~ 1, 0 ~ z ~ 1.
(Hannonic functions) Verify Theorem 1 for
f = y2 - x 2 and the surface of the cylinder
x 2 + y2 ~ I, 0 ~ z ~ 5.
(Green's first formula) Verify (8) for f = 3y2,
g = x 2 , S the surface of the cube in Prob. I.
(Green's first formula) Verify (8) for f = x,
g = y2 + ;:2. S the surface of the box 0 ~ x ~ 1,
o ~ Y ~ 2, 0 ~ z ~ 3.
(Green's second formula) Verify (9) for the data in
Prob.3.
(Green's second formula) Verify (9) for f = x4,
g = y2 and the cube in Prob. l.
=
S
~
=
II
(a)
:!
dA =
I
IIlgrad gl2 dV.
T
S
II (f
(c)
tP
8. Find the volume of a ball of radius a by means of the
formula in Prob. 7.
9. Show that a region T with boundary surface S has the
volume
S
Of) dA = O.
og - g
on
011
(d) If ofIan = fJglon on S, then f = g
c is a constant.
+ c in T, where
(e) The Laplacian can be represented independently
of coordinate systems in the form
v2
=
lim
_1_
f d(T)~O VeT)
IIXdydz
JI of
dA
an
S(T)
where d(T) is the maximum distance of the points of a
region T bounded by SeT) from the point at which the
Laplacian is evaluated and veT) is the volume of T.
S
=
g
S
(b) If aglan = 0 on S, then g i8 constant in T.
where r is the distance of a variable point P: (x, y, z)
on S from the origin 0 and is the angle between the
directed line OP and the outer normal of Sat P.(Make
a sketch.)
V=
II(XdydZ + ydzdx + zdxdy).
10. TEAM PROJECT. Divergence Theorem and
Potential Theory. The importance of the divergence
theorem in potential theory is obvious from (7)-(9)
and Theorems I - 3. To emphasize it further, consider
functions f and g that are harmonic in some domain D
containing a region Twith boundary surface S such that
T satisfies the assumptions in the divergence theorem.
Prove and illustrate by examples that then:
~ IJrcostPdA
3
~
S
7. (Volume as a surface integral) Show that a region T
with boundary surface S has the volume
V=
IIZdxdy
IIyd::dx
S
10.9
Stokes's Theorem
Having seen the great usefulness of Gauss's divergence theorem, we now tum to the
second "big" theorem in this chapter, Stokes's theorem. This theorem transforms line
integrals into surface integrals and conversely. Hence it generalizes Green's theorem of
Sec. 10.4. Stokes's theorem involves the curl
j
(1)
curl F
= a/ax
k
(see Sec. 9.9).
SEC.10.9
469
Stokes's Theorem
THEOREM 1
Stokes's Theorem 9
{Transformation Between Surface and Line Integrals}
Let S be a piecewise STllooth 9 oriented suiface in space and let the boundary of S
be a piecewise smooth simple closed curve C. Let F(x, y, z) be a continuous vector
function that has continuous first partial derivatives in a domain in space containing
S. Then
JsJ(curl F)en dA = f Fer' (s) ds.
(2)
C
Here n is a unit nonnal vector of S and, depending on n, the integration around C
is taken in the sense shown in Fig. 251. Furthermore, r' = dr/ds is the unit tangent
vector and s the arc length of C.
In components, formula (2) becomes
(2*)
=
f_(Fl dx + F2 dy + F3 dz).
C
Here,
F = [Flo F2 , F3]'
N = [Nl , N 2, N3]'
n dA = N du du,
r' ds = [dx. dy, dz]. and R is the region with boundary curve C in the uv-plane
corresponding to S represented by r(u. v).
z
The proof follows after Example 1.
(\
~c
r'~
J
r'
Fig. 251.
E X AMP L E 1
x
n
Stokes's theorem
Fig. 252.
y
Surface 5 in Example 1
Verification of Stokes's Theorem
Before we prove Stokes's theorem, let
(Fig. 252)
LIS
first get L1sed to it by verifying it for F = [y,
z, xl and S the paraboloid
z ~ O.
Solution. The curve C, oriented as in Fig. 252, is the circle r(s) = [cos s, sin s, 0]. Its unit tangent vector
is r' (s) = I-sin s, cos s, 0]. The function F = [y, z, x] on Cis F(r(s)) = [sin s, 0, cos s]. Hence
f
J
27T
Fodr
=
F(r(s))or'(s)ds=
2'17"
J
[(sins)(-sins)+O+O]ds=-'7T.
C O O
9 Sir GEORGE GABRIEL STOKES (l819-1903).lrish mathematician and physicist who became a professor
in Cambridge in 1849. He is also known for his important contribution to the theory of infinite series and to
viscous flow (Navier-Stokes equations), geodesy, and optics.
"Piecewise smooth" curves and surfaces are defined in Sees. 10.1 and 10.5.
CHAP. 10
470
Vector Integral Calculus. Integral Theorems
We now consider the surface integral. We have Fl = y. F2 = :, F3 = X. so that in (2*) we obtain
curlF= Cllrl[Fl'
F3 ] = cllrl[y.
F2,
"
-I.
xj=[-1.
-11.
A normal vector of Sis N = grad(;: - J(x, y)) = [2.-.2,'. I]. Hence (curl F)oN = -2\' - 2y - I. Now
n dA = N dx dy (see (3'') in Sec. 10.6 with x, y instead of II, u). Using polar coordinates r. e defined by
x = r cos e, y = r sin e and denoting the projection of S into the x\'-plane by R. we thus obtain
f I(curIF)ondA = f I<CLlrlF)ON dxdy = I f<-2X - 2y - I)dxdy
S
R
R
I I
f (- f
2.".
=
1
(-2r(cos8+ sin 8) - I)rdr£le
8=0 7·=0
2.".
=
(cos
e+
sin 8)
O~O
PROOF
-1)
d8 = 0 + 0
-1
(21T)
=
-1T.
•
We prove Stokes's theorem. Obviously, (2) holds if the integrals of each component on
both c;ides of (2*) are equal; that is,
{I(
(3)
rc
aaF?l N.2 - -.a FN3
l ) du dv = ,( Fl dx
ely
(4)
(5)
We prove this first for a surface S that can be represented simultaneously in the forms
(6)
(a)
;:
=
f(x, y),
We prove (3), using (6a). Setting u
r(u, v)
=
y
(b)
=
g(x, .:),
(c)
x
=
h(y, ;:).
= x, v = y, we have from (6a)
rex, y)
= [x, y, f(x, y)] =
xi
+ yj +
fk
and in (2), Sec. 10.6. by direct calculation
Note that N is an upper normal vector of S, since it has a positive z-component. Also,
R = S*, the projection of S into the x,v-plane, with boundary curve E = C* (Fig. 253).
Hence the left side of (3) is
(7)
I I [aF
S*
l
(-fy) -
az
aFlJ
dxdv.
iJy'
We now consider the light side of (3). We transform this line integral over E = C* into
a double integral over S* by applying Green's theorem [formula (1) in Sec, 10.4 with
F2 = 0]. This gives
,( Fldx =
Jc *
ffS*
aFl dxd)'.
ay
SEC. 10.9
471
Stokes's Theorem
~
::n
z
1
1
S
1
1
1
1
1
1
1
1------
~y
~
C*
x
Fig. 253.
Here, Fl
= FI(x, y,
Proof of Stokes's theorem
f(x, y)). Hence by the chain rule (see also Prob. 10 in Problem Set 9.6),
(!Fl(X, y, f(x, y))
iJFl(X, y, z)
iJy
iJy
Cly
(Jz
[z = f(x. y)].
We see that the right side of this equals the integrand in (7). This proves (3). Relations
(4) and (5) follow in the same way if we use (6b) and (6c), respectively. By addition we
obtain (2*). This proves Stokes's theorem for a surface S that can be represented
simultaneously in the forms (6a), (6b), (6c).
As in the proof of the divergence theorem, our result may be immediately extended to
a surface S that can be decomposed into finitely many pieces, each of which is of the kind
just considered. This covers most of the cases of practical imerest. The proof in the case
of a most general surface S satisfying the assumptions of the theorem would require a limit
•
process; this is similar to the situation in the case of Green's theorem in Sec. lOA.
E X AMP L E 2
Green's Theorem in the Plane as a Special Case of Stokes's Theorem
Let F = IFl' F21 = F1 i + F2 j be a vector function that is continuously differentiable in a domain in the
\"y-plane containing a simply connected bounded closed region S whose boundary C is a piecewise smooth
simple closed curve. Then. according to (I),
(curIF)on
aF
aF
l
= (curlF)ok = ---2 - ---.
ax
ay
Hence the formula in Stokes's theorem now takes the form
II( a~2
s
rJx
-
(IFI) dA
=
Ay
J.
'j c
(F1 dx -'- F2 dy).
This shows that Green's theorem in the plane (Sec. 10.4) is a special case of Stokes's theorem (which we needed
•
in the proof of the latter!).
E X AMP L E 3
Evaluation of a Line Integral by Stokes's Theorem
Evaluate f c For' ds, where C is the circle x 2 +
y2 =
4, z = - 3, oriented counterclockwise as seen by a person
standing at the origin, and. with respect to right -handed Cartesian coordinates.
i
~ 4 in the plane;: = - 3.
Then n in Stokes's theorem points in the po~itive ;:-direction; thus n = k. Hence (curl F)on is simply the
compone~t of curl F in the positive ~-direction. Since F with;:: = -3 has the components F1 = y, F2 = -27x,
F3 = 3y , we thus obtain
Solutioll. As a surface S bounded by C we can take the plane circular disk x2 +
(curl F)on
(JF
iJF
iJx
iJy
= -.-2 - ---1 =
-27 - I
=
-28.
CHAP. 10
472
Vector Integral Calculus. Integral Theorems
Hence the integral over S in Stokes· s theorem equals - 28 times the area 47T of the disk S. This yields the answer
-28' 47T = -1127T = -352. Confirm this by direct calculation, which involves somewhat more work.
•
E X AMP L E 4
Physical Meaning of the Curl in Fluid Motion. Circulation
Let ST be a circular disk of radius "0 and center P bounded by the circle CTo (Fig. 254), and let
O
F(Q) == F(x, y, :::) be a continuously differentiable vector function in a domain containing ST • Then by Stokes's
O
theorem and the mean value theorem for sUiface integrab (see Sec. 10.6),
where ATo is the area of S'o and
Fig. 254.
P~
is a
~uitable
point of S"o. This may be written in the form
Example 4
In the case of a fluid motion with velocity vector F = v, the integral
is called the circulation of the t10w around C ro . It measures the extent to which the corresponding fluid motion
is a rotation around the circle CTO • If we now let ro approach zero, we find
(8)
that is. the component of the curl in the positive normal direction can be regarded
(circulation per unit area) of the flow in the sUiface at the corresponding point.
E X AMP L E 5
a~
the specific circulation
•
Work Done in the Displacement around a Closed Curve
Find the work done by the force F = 2ry3 sin::: i + 3x\2 sin::: j + x 2 cos::: k in the displacement around the
curve of intersection of the paraboloid z = x2 + y2 and the cylinder (r - 1)2 + y2 = l.
.l
Solutioll. This work is given by the line integml in Stokes's theorem. Now F = grad f, where f = X 2 y3 sin:::
and curl(grad f) = 0 (see (2) in Sec. 9.9). so that (cur! F)-n = 0 and the work is 0 by Stokes's theorem. This
agrees with the fact that the present field is conservative (definition in Sec. 9.7).
•
Stokes's Theorem Applied to Path Independence
We emphasized in Sec. 10.2 that the value of a line integral generally depends not only
on the function to be integrated and on the two endpoints A and B of the path of integration
C, but also on the particular choice of a path from A to B. In Theorem 3 of Sec. 10.2 we
proved that if a line integral
I F(r)odr = I
(9)
C
c
(FI dx
+ F2 dy + F3 d;:;)
(involving continuous F], F2 , F3 that have continuous first partial derivatives) is path
independent in a domain D, then curl F = 0 in D. And we claimed in Sec. 10.2 that.
conversely. curl F = 0 everywhere in D implies path independence of (9) in D provided
D is simply connected. A proof of this needs Stokes's theorem and can now be given as
follows.
Let C be any closed path in D. Since D is simply connected. we can find a surface S
in D bounded by C. Stokes's theorem applies and gives
fc
(Fl dx
+ F2 dy + F3 d;:;)
=
fc For' ds
=
J J(curl F)on dA
s
473
Chapter 10 Review Questions and Problems
for proper direction on C and nonnal vector n on S. Since curl F = 0 in D, the surface
integral and hence the line integral are zero. This and Theorem 2 of Sec. 10.2 imply that
the integral (9) is path independent in D. This completes the proof.
•
-
..-.- .
11-81
DIRECT INTEGRATION OF THE SURFACE
INTEGRALS
Evaluate the integral
F and S.
1. F = [4Z2,
2. F = [0,
16x,
0,
s: x 2 + y2
3. F =
II
=
0
0], s: Z = Y (0 ~ x ~ 1, 0 ~ y ~ I)
5x cos z],
4, Y ~ 0, 0 ~
eZ,
[-e Y ,
(curl F) n dA directly for the given
S
z~
~7T
eX],
s: Z = x + y (0 ~ x ~ 1, 0 ~ y ~ 1)
4. F = [3 cos y, cosh z, x],
S the square 0 ~ x ~ 2, 0 ~ y ~ 2, z
2Z
Z
=
18. F =
9. Verify Stokes's theorem for F and S in Prob. 7.
10. Verify Stokes's theorem for F and S in Prob. 8.
EVALUATION OF
f
For' ds
c
Calculate this line integral by Stokes's theorem, clockwise
as seen by a person standing at the origin, for the following
F and C. Assume the Cartesian coordinates to be righthanded. (Show the details.)
: - -_11"
[z,
x,
y]. C as in Prob. 13
7 ~
?
....
__
7. F = [Z2. ~x, 0],
S the square 0 ~ x ~ a, 0 ~ y ~ a, Z = 1
8. F = [y3. -x3, 0], S: x 2 + y2 ~ I. Z = 0
111-181
13. F = [y2, x2, -x + z], around the triangle with
vertices (0, 0, I). (I. O. I), (1, 1, I)
14. F = [y, xy3, - Zy 3],
C the circle x 2 + y2 = a 2, Z = b (> 0)
IS. F = [y, Z2, x 3 ], C as in Prob. 12
16. F = [x 2, y2, Z2],
C the intersection of x 2 + y2 + Z2 = 4 and z = y2
17. F = [cos 7T)" sin 7TX, 0], around the rectangle with
vertices (0, 1,0), (0, 0, I), (1, 0, I), (1, 1. 0)
4
Z
S. F = [e , e sin y, e cos y],
S: Z = y2 (0 ~ X ~ 4, 0 ~ y ~ I)
2 = x 2 + ),2, ]
2
7
,,2:
.(., X-,
_v 2 ] , S'.....
_ 0, 0 ~
_
6• F = [_2
11. F = [-3y. 3x. z], C the circle x 2 + y2 = 4. z = 1
12. F = [4z, -2x, 2x],
C the intersection of x 2 + )'2 = I and z = y + 1
20. WRITING
PROJECT. Grad, Div, Curl in
Connection with Integrals. Make a list of ideas and
results on this topic in this chapter. See whether you
can rearrange or combine parts of your material. Then
subdivide the material into 3-5 portions and work out
the details of each portion. Include no proofs but simple
typical examples of your own that lead to a better
understanding of the material.
..
1. List the kinds of integrals in this chapter and how the
integral theorems relate some of them.
2. How can work of a variable force be expressed by an
integral?
3. State from memory how you can evaluate a line integral.
A double integral.
4. What do you remember about path independence? Why
is it important?
5. How did we Use Stokes's theorem in connection with
path independence?
6. State the definition of curl. Why is it important in this
chapter?
7. How can you transform a double integral or a surface
integral into a line integral?
f
Fo r' ds,
c
F = (x 2 + y2)-1[ -y,x], C: x 2 + y2 = I, z = 0, oriented
clockwise. Why can Stokes's theorem not be applied?
What (false) result would it give?
19. (Stokes's theorem not applicable) Evaluate
AND PROBLEMS
8. What is orientation of a surface? What is its role in
connection with surface integrals?
9. State the divergence theorem and its applications from
memory.
10. State Laplace's equation. Where in physics is it
important? What properties of its solutions did we
discuss?
111-201
LINE INTEGRALS
I
(WORK INTEGRALS)
F(r)odr
C
Evaluate. with F and C as given, by the method that seems
most suitable. Recall that if F is a force, the integral gives
the work done in a displacement along C. (Show the details.)
11. F =
[x 2•
y2,
Z2],
C the straight-line segment from (4, I, 8) to (0, 2, 3)
474
CHAP. 10
Vector Integral Calculus. Integral Theorems
I
25. I
12. F = [cos;::, -sin z, -x sin;:: - y cos ;::]. C the
straight-line segment from (-2. 0, ~'iT) to (4. 3. 0)
13. F = [x - y, 0, eZ ],
C: y = 3x2 , Z = 2x for x from 0 to 2
24.
14. F = [yz, 2;::x, xy],
C the circle x 2 + y2 = 9,
126-35 1
Z
=
=
x2
+ )'2, R: x 2 + )'2
= 2x2,
~ I, x ~ 0, y ~ 0
R the region below y = x + 2 and above
)' = x 2
SURFACE INTEGRALS
ff
l, counterclockwise
15.F=[-3v3 , 3x3 +cosy. 0].
C the circle x 2 + )'2 = 16. z = 0, counterclockwise
16. F = [sin 10', cos m:, sin 17X].
C the boundary of 0 ~ x ~ 112, 0 ~ y ~ 2, z = 2x
Evaluate this integral directly or. if
divergence theorem. (Show the details.)
17. F = [9z, 5x, 3.\'],
C the ellipse x 2 + )'2 = 9. z = x + 2
18. F = [cosh x, e 4y , tan z], C: x 2 + )'2 = -1-,
(Sketch C.)
19. F = [Z2. x 3• y2], C: x 2 + )'2 = 4, x + Y +
27. F = [yo -x. 0].
S: 3 t' + 2 Y + z = 6, x
2
Z = x .
Z =
0
2
= [x •
y2, )'2X], C the helix
20. F
r = [2 cos I. 2 sin I, 61] from (2. O. 0) to (0. 2,
~... 1-251
317)
DOUBLE INTEGRALS,
CENTER OF GRAVITY
Find the coordinmes .i. y of the center of gravity of a mass
of density I(x. y) in the region R. (Sketch R. Shmv the
details.)
21. I = 2x)" R the triangle with vertices (0, 0), (1, 0),
22.
23.
Fon dA
5
26. F = [2X2, 4.", 0],
S: x + y + z = 1, x ~ 0, y
~
~
0, y
0, z
~
pos~ible.
~
0, z
by the
0
~
0
28. F = [x - y, y - z, z - x],
S the sphere of radius 5 and center 0
29. F = [y2, x 2, Z2].
S the surface of x 2 +
y2
~ 4, 0 ~ Z ~ 5
30. F = [-,,3, x 3 , 3z 2 ],
S the portion of the paraboloid z = x2 +
31. F = [sin2 x, -y sin 2x, 5;::].
S the sul1'ace of the box Ixl ~ a,
32. F = [1,
33. F
=
[x,
I,
xy,
a]. S: x
2
z], S: x 2
y2,
Iyl ~ b, Izl
z~4
~ c
+ )'2 + 4;::2 = 4, z ~ 0
+ y2 = I, 0 ~ z ~ h
(1, I)
34. F as in Prob. 33, S the complete boundary of
x2 + )'2 ~ I, 0 ~ z ~ II
I
I
35. F = leY, 0, ze X ]. Sthe rectangle with vertices (0, O. 0).
(1.2,0), (0, O. 5), (1, 2, 5)
R: 0 ~ y ~ I - x 2
= 1. R: x 2 + y2 ~ a 2, y ~ 0
= I,
. ......
.:..:...
:.
. ........ _\I...
1ft
.... -
Vector Integral Calculus. Integral Theorems
Chapter 9 extended differential calculus to vectors, that is, to vector functions
vex, y, z) or vet). Similarly. Chapter 10 extends integral calculus to vector functions.
This involves line integrals (Sec. 10.1), double integrals (Sec. 10.3), swface
integrals (Sec. 10.6), and triple integrals (Sec. 10.7) and the three "big" theorems
for transforming these integrals into one another, the theorems of Green (Sec. 10.4),
Gauss (Sec. 10.7), and Stokes (Sec. lO.9).
The analog of the definite integral of calculus is the line integral (Sec. 10.1)
(1)
where C: r(t) = [x(t), y(t), z(t)] = x(t)i + y(t)j + z(t)k (a ~ t ~ b) is a curve in
space (or in the plane). Physically. (I) may represent the work done by a (variable)
force in a displacement. Other kinds of line integrals and their applications are also
discussed in Sec. 10.1.
Summary of Chapter 10
475
Independence of path of a line integral in a domain D means that the integral
of a given function over any path C with endpoints P and Q has the same value for
all paths from P to Q that lie in D; here P and Q are fixed. An integral (1) is
independent of path in D if and only if the differential form Fl dx + F2 dy + F3 dz
with continuous F I , F2 • F3 is exact in D (Sec. LO.2). Also, if curl F = 0, where
F = [Fl' F2 , F3]' has continuous first partial derivatives in a simp/" connected
domain D, then the integral (1) is independent of path in D (Sec. 10.2).
Integral Theorems. The formula of Green's theorem in the plane (Sec. 10.4)
(2)
II( -iJF2
ax
R
l
- -iJF
ay
)
dr: dy
.
=
T
c
(F dx
I
+
F dy)
2
.
transforms double integrals over a region R in the xy-plane into line integrals over
the boundary curve C of R and conversely. For other forms of (2) see Sec. lOA.
Similarly, the formula of the divergence theorem of Gauss (Sec. 10.7)
(3)
I I I div F dV = I I F- n dA
T
S
transforms triple integrals over a region T in space into surface integrals over the
boundary surface S of T. and conversely. Formula (3) implies Green's formulas
(4)
III (f'\P g
+ Vf-Vg)dV=
T
IIf
S
~g
an
dA,
(5)
Finally, the formula of Stokes's theorem (Sec. 10.9)
(6)
IsI(curl F)-n dA Tc F-r' (s) ds
=
transforms surface integrals over a surface S into line integrals over the boundary
curve C of S and conversely.
PA RT
c
Fourier Analysis.
•••
Partial
Differential
Equations
C HAP T E R 11
Fourier Series, Integrals, and Transforms
C HAP T E R 1 2
Partial Differential Equations (PDEs)
Fourier analysis concerns periodic phenomena, as they occur quite frequently in
engineering and elsewhere-think of rotating parts of machines, alternating electric
currents, or the motion of planets. Related periodic functions may be complicated. This
situation poses the important practical task of representing these complicated functions in
terms of simple periodic functions. namely. cosines and sines. These representations will
be infinite series, called Fourier series. l
The creation of these series was one of the most path-breaking events in applied
mathematics, and we mention that it also had considerable influence on matl1ematics as
a whole, on the concept of a function. on integration theory, on convergence tl1eory for
series. and so on (see Ref. [OR7] in App. 1).
Chapter II is concerned mainly with Fourier series. However, the underlying ideas can
also be extended to nonperiodic phenomena. This leads to Fourier integrals and
fransjonl1s. A common name for the whole area is Fourier analysis.
Chapter 12 deals witl1 the most important partial differential equations (PDEs) of physics
and engineering. This is the area in which Fourier analysis has its most basic applications,
related to boundary and initial value problems of mechanics, heat flow, electrostatics, and
other fields.
IJEAN-BAPTISTE JOSEPH FOURIER (1768-1830). French physicist and mathematician, lived and taught
in Paris. accompanied Napoleon in the Egyptian War. and was later made prefect of Grenoble. The beginnings
on Fourier series can be found in works by Euler and by Daniel Bernoulli, but it was Fourier who employed
them in a systematic and general manner in his main work, Theorie allalyflque de la chaleur (Analytic Theory
of Heat. Paris, 1822). in which he developed the theory of heat conduction (heat equation; see Sec. 12.5), making
these series a most important tool in applied mathematics.
-
477
·
/
.'
.
CHAPTER
\
11
Fourier Series, Integrals,
and Transforms
Fourier series (Sec. 11.1) are infinite series designed to represent general periodic
functions in terms of simple ones, namely. cosines and sines. They constitute a very
important tool, in particular in solving problems that involve ODEs and PDEs.
In this chapter we discuss Fourier series and their engineering use from a practical point
of view, in connection with ODEs and with the approximation of periodic functions.
Application to PDEs follows in Chap. 12.
The theory of Fourier series is complicated. but we shall see that the application of these
series is rather simple. Fourier series are in a certain sense more universal than the familiar
Tay lor series in calculus because many discontinuous periodic functions of practical interest
can be developed in Fourier series but, of course, do not have Taylor series representations.
In the last sections (11.7-11.9) we consider Fourier integrals and Fourier transforms,
which extend the ideas dnd techniques of Fourier series to nonperiodic functions and have
basic applications to PDEs (to be shown in the next chapter).
Prerequisite: Elementary integral calculus (needed for Fourier coefficients)
Sections that lIlay be nmitted in a shorter course: 11.4-11.9
References alld Answers to Problems: App. 1 Part C. App. 2.
11.1
Fourier Series
Fourier series are the basic tool for representing periodic functions, which play an
important role in applications. A function f(x) is called a periodic function if f(x) is
defined for all real x (perhaps except at some points, such as x = ±7T!2, ±37T/2, ... for
tan x) and if there is some positive number p. called a period of f(x). such that
(1)
f(x
+ p) =
f(x)
for all x.
The graph of such a function is obtained by periodic repetition of its graph in any interval
of length p (Fig. 255).
Familiar periodic functions are the cosine and sine functions. Examples of functions
that are not periodic are x, x 2 , x 3 , eX, cosh x, and In x, to mention just a few.
If f(x) has period p, it also has the period 2p because (I) implies
f(x + 2p) = f([x + p] + p) = f(x + p) = f(x), etc.; thus for any integer 11 = 1,2,3, .. "
(2)
478
f(x
+ np) =
f(x)
for all x.
SEC. 11.1
479
Fourier Series
{(x)
x
Fig. 255.
Periodic function
Furthermore if f(x) and g(x) have period p, then af(x) + bg(x) with any constants a and
b also has the period p.
Our problem in the first few sections of this chapter will be the representation of various
functions f(x) of period 217 in terms of the simple functions
(3)
I,
cos x,
cos 2x,
sin x,
sin 2x, ... ,
cos
In:,
sin
/lX, . • • .
All these functions have the period 27T. They form the so-called trigonometric system. Figure
256 shows the fIrst few of them (except for the constant 1, which is periodic with any period).
The series to be obtained will be a trigonometric series, that is, a series of the form
ao + a1 cos x
(4)
+ bi
ao
=
sin x
+ .L
+ a2 cos 2\'" + b 2 sin 2x +
(an cos
+ b n sin nx).
IlX
n~I
ao, Lib b l . a2, b2, ... are constants, called the coefficients of the series. We see that each
term has the period 27T. Hence if the coefficients are such that the series converges, its
sum will be a function of period 27T.
It can be shown that if the series on the left side of (4) converges, then inserting
parentheses on the right gives a series that converges and has the same sum as the series
on the left. This justifIes the equality in (4).
Now suppose that f(x) is a given function of period 27T and is such that it can be
represented by a series (4), that is, (4) converges and, moreover, has the sum f(x). Then,
using the equality sign, we write
f(x)
(5)
= ao + .L (an cos nx + bn sin nx)
n~I
:\vnv
/:\ L
o
:\
f\,!\
2n
cos 2x
cos x
V\
V
sin x
cos 3x
1f!\.
2n
V
sin 2x
Fig. 256.
Cosine and sine functions having the period 2IT
Sin
3x
(,
480
CHAP. 11
Fourier Series, Integrals, and Transforms
and call (5) the Fourier series of f(x). We shall prove that in this case the coefficients
of (5) are the so-called Fourier coefficients of f(x), given by the Euler formulas
(a)
ao =
- f
I
71"
27T
-71"
-f
I
(6)
an =
(b)
7T
7T
f(x) cos
I1X
dx
n = 1.2.···
f(x) sin
11-1:
dx
11
-71"
f
I
bn = 7T
(c)
f(x) dx
7T
= 1,2, ....
-7T
The name "Fourier series" is sometimes also used in the exceptional case that (5) with
coefficients (6) does not converge or does not have the sum f(x)-this may happen but
is merely of theoretical interest. (For Euler see footnote 4 in Sec. 2.5.)
A Basic Example
Before we derive the Euler formulas (6). let us become familiar with the application of
(5) and (6) in the case of an important example. Since your work for other functions will
be quite similar, try to fully understand every detail of the integrations, which because of
the 11 involved differ somewhat from what you have practiced in calculus. Do not just
routinely use your software, but make observations: How are continuous functions (cosines
and sines) able to represent a given discontinuous function? How does the quality of the
approximation increase if you take more and more terms of the series? Why are the
approximating functions, called the partial sums of the series, always zero at 0 and 7T?
Why is the factor lin (obtained in the integration) important'?
E X AMP L E 1
Periodic Rectangular Wave (Fig. 257a)
Find the Fourier coefficients of the periodic function f(x) in Fig. 257a. The formula is
(7)
f(x) =
-k
{ k
if
-71"<X<O
if
O<X<71"
and
f(x
+
271")
=
f(x).
Functions of this kind occur as external forces acting on mechanical systems, electromotive forces in electric
circuits, etc. (The value of f(x) at a single point does not affect the integral: hence we can leave f(x) undefined
at x = 0 and x = 2:71".)
Solution.
From (6a) we obtain ao = O. This can also be seen without integration, since the area under the
curve of f(x) between -71" and 71" is zero. From (6bl.
an =
-
I
7r
f'" f(x) cos nxdx
=
-'iT
I
7T
[
f
0
'IT
(-k)COSI1Xdx+f kcosl1xdx
0
7r
[
]
0
-'iT
nx
sin -k n
1
-7T
sin-nx
+k
n
I"']
cO
0
because sin nx = 0 at -71", 0, and 71" for all n = 1, 2, .... Similarly, from (6cl we obtain
bn =
~
71"
f'"_'" f(x) sin nx dx =
7r
[f
O(-k)
1 [ kcos
- nx
n
'IT
sin nx dx
+
f'" k sin 17X dX]
0
_"
/0
-'iT
- kcos
-n.r
- /"'] .
n
0
SEC. 11.1
481
Fourier Series
-n
n
0
-----l-k
L
2n
x
1- J
(a) The gIven function {(x) (Periodic rectangular wave)
x
"
'~<
n
/
x
4k sin 3x
3"
'-,
n
...... _ /
x
4k sin 5x
5"
(b) The first three partial sums of the corresponding Fourier series
Fig. 257.
Eample 1
Since cos ( -a) = cos a and cos 0 = 1, this yields
bn =
k
nn
[cos 0 - cos (-n7T)
-
cos n7T
+
cos 0] =
U
~
nn
(1 - cos n7T).
Now, cos 71" = -1, cos 271" = 1, cos 371" = -1, etc.; in general,
cos n71" = {
-I
I
for odd n,
e
for odd n,
and thus
for even n,
I - cosn71" =
for cven n.
Hence the Fourier coefficients h n of our function are
4k
4k
h5 = 571" '
CHAP. 11
482
Fourier Series, Integrals, and Transforms
Since the an are 7ero, the Fourier series of f(x) is
+ ..!..
sin 3x + ..!.. sin 5x + ... )
3
5
.
4k (Sin x
(8)
7T
The partial sums are
4k
S = 4k (sin x
Sl = ~sinx,
2
7T
+ ..!..
3
sin
3X) '
etc.,
Their graphs in Fig. 257 ,eem to indicate that the series is convergent and has the sum f(x), the given function.
We notice that at x = 0 and x = 7T, the points of discontinuity of f(x), all partial sums have the value zero, the
arithmetic mean of the limits -k and k of our function, at these points.
Furthermore, assuming that f(x) is the sum of the series and setting x = 7TI2, we have
thus
1
1
1
7T
1--+---+-···=-.
3
5
7
4
This is a famous result obtained by Leibniz in 1673 from geometric considerations. It illustrates that the value,
of various series with constant terms can be obtained by evaluating Fourier series at specific points.
•
Derivation of the Euler Formulas (6)
The key to the Euler formulas (6) is the orthogonality of (3), a concept of basic importance,
as follows.
THEOREM 1
Orthogonality of the Trigonometric System (3)
The trigonometric system (3) is orthogonal on the interval -7T ~ X ~ 7T (hence also
on 0 ~ x ~ 27T or any other interval of length 27T because of periodicity): that is,
the integral of the product of any two functions in (3) over that interval is 0, so that
for any integers nand nz,
(a)
J7T cos nx cos nIX dx = 0
(n
=/=-
m)
0
(n
=/=-
m)
0
(n
=/=-
m or n
-7T
(9)
J" sin nx sin mx dx =
(b)
-7T
J7T sin nx cos mx dx =
(e)
= m).
-7T
PROOF
This follows simply by transfonning the integrands trigonometrically from product'> into
sums. In (9a) and (9b), by (11) in App. A3.I,
I
1""
7T
cos nx cos nIX dx
-7T
= -
2
1
J cos (n + m)x dx + -2 J_.".cos (n -
m)x dx
-7T
J"" sin nx sin nzx dx = -2 J cos (n -7T
17T
7T
-7T
J
m)x dx - 2
J cos (n + m)x dx.
7T
-7T
SEC 11.1
483
Fourier Series
Since m * n (integer!), the integrals on the right are all O. Similarly, in (9c), for all integer
m and n (without exception; do you see why?)
I
~
sin nx cos mx dx = - J sin (n
_~
2 _~
~
J
~
+
m)x dr:
+
2
J
sin (n - lIl)x dr: = 0
_~
+ O. •
Application of Theorem 1 to the Fourier Series (5)
We prove (6a). Integrating on both sides of (5) from
~
~
L}(X)
dx = L~
[
ao
+ ~l
00
-7T
(an cos rue
to
7T,
we get
+ bn sin Itt)
]
dx.
We now assume that termwise integration is allowed. (We shall say in the proof of
Theorem 2 when this is true.) Then we obtain
The first tenn on the right equals 27Tao. Integration shows that all the other integrals are
O. Hence division by 27T gives (6a).
We prove (6b). Multiplying (5) on both sides by cos 11/X with any fixed positive integer
m and integrating from - 7T to 7T, we have
(10)
~
J_~f(X)
cos mx dx
=
~
J_~
[
ao
+ ~1
YO
(an cos nx
]
+ hn sin nx) cos mx dx.
We now integrate term by term. Then on the right we obtain an integral of ao cos mx.
which is 0; an integral of an cos nx cos 17U, which is am 7T for n = 11/ and 0 for n =/=- 111 by
(9a); and an integral of bn sin In cos 111X, which is 0 for all nand 111 by (9c). Hence the
right side of (10) equals a m 7T. Division by 7T gives (6b) (with 111 instead of n).
We finally prove (6c). Multiplying (5) on both sides by sin my with any fixed positive
integer 111 and integrating from - 7T to 7T, we get
(II)
~
L}(X)
sin mx dx =
LTi~ [ ao + ~l= (an cos nx + hn sin nx) ]
sin mx dr:.
Integrating term by term, we obtain on the right an integral of a o sin mx, which is 0; an
integral of an cos nx sin mx, which is 0 by (9c); and an integral of h n sin 11.)( sin llU", which
is hm 7T if n = 1ll and 0 if 17 =/=- m, by (9b). This implies (6c) (with n denoted by m). This
completes the proof of the Euler formulas (6) for the Fourier coefficients.
•
CHAP. 11
484
Fourier Series, Integrals, and Transforms
Convergence and Sum of a Fourier Series
The class of functions that can be represented by Fourier series is surprisingly large and
general. Sufficient conditions valid in most applications are as follows.
Representation
THEOREM 2
by a Fourier Series
Let f(x) he periodic with period 271" and pieceJ,vise cOlltinuous (see Sec. 6.1) in the
interval -71" ~ X ~ 71". Furthermore, let f(x) have a left-hand derivative and a
right-hand derivative at each point of that interval. Then the Fourier series (5) of
f(x) [with coefficients (6)] conver!!es. Its sum is f(x), except at points Xo where f(x)
is discontinuous. There the slim of the series is the average of the left- and
right-hand limits 2 of f(x) at Xo.
PROOF
We prove convergence in Theorem 2. We prove convergence for a continuous function
f(x} having continuous first and second derivatives. Integrating (6b) by parts, we obtain.
an
= -1 I71" f(x) cos nx dr: = f(x) sin IlX
71"
-71"
- - I I71" f ,(x) sin nx dt.
17T
-7T
n71"
n71"_7T
The first teml on the right is zero. Another integration by parts gives
an
= t' (.x) 2cos nx
171"
n 71"
-
I
n 71"
-2-
-7T
I7T f "(x) cos nx dx.
-7r
The firs I term on the right is zero because of the periodicity and continuity of f' (x). Since
f" is continuous in the interval of integration, we have
If"(x)1 <
M
for an appropriate constant M. Furthermore, Icos nxl ~ 1. It follows that
lanl =
-i-I
n 71" I
7T {'ex) cos nx dxl
<
-7T
-iI7T M dx
n 71"
=
-7T
2M
n2
.
f(x)
f(l- 0)
j~
o
2 The left-hand limit of f(x) at Xo is defined as the limit of f(x) as x approaches Xo from the left
and is commonly denoted by f(xo - 0). Thus
Left- and
right-hand limits
Fig. 258.
+ 0) =i
of the function
X2
I(x) =
{
x/2
if x < 1
0 through positive values.
+ h) as h --->
0 through positive values.
h~O
The right-hand limit is denoted by f(xo
f(xo
+ 0)
+ 0) and
= lim f(xo
11._0
1(1 - O} = 1,
1(1
~
f(xo - 0) = lim f(xo - Iz) as h
x
The left- and right-hand derivatives of f(x) at xo are defined as the limits of
f(x o - Iz) - f(x o - 0)
-Iz
and
f(xo
+ Iz)
- f(xo
+ 0)
It
respectively, as Iz ---> 0 through positive values. Of course if f(x) is continuous at X()o the last tenn in
both numerators is simply flxo).
SEC. 11.1
Fourier Series
485
Ibnl
Similarly,
< 2 Mln 2 for alln. Hence the absolute value of each teml of the Fourier
series of f(x) is at most equal to the corresponding term of the series
I
I + 2M ( 1 + 1 + -221 + -22I+ I
-32 +
-32 + ... )
la o
which is convergent. Hence that Fourier series converges and the proof is complete.
(Readers already familiar with uniform convergence will see that, by the Weierstrass test
in Sec. 15.5, under our present assumptions the Fourier series converges uniformly, and
our derivation of (6) by integrating term by term is then justified by Theorem 3 of
Sec. 15.5.)
The proof of convergence in the case of a piecewise continuous function f(x) and the
proof that under the assumptions in the theorem the Fourier series (5) with coefficients
(6) represents f(x) are substantially more complicated; see, for instance, Ref. [C121. •
E X AMP L E 2
Convergence at a Jump as Indicated in Theorem 2
The rectangular wave in Example I has a jump at x = O. Its left-hand limit there is -k and its right-hand limit
is k (Fig. 257). Hence the average of these limits is O. The Fourier series (8) of the wave does indeed converge
to this value when x = 0 because then all its terms are O. Similarly for the other jumps. This is in agreement
with Theorem 2.
•
Summary. A Fourier series of a given function f(x) of period 271' is a series of the form
(5) with coefficients given by the Euler formulas (6). Theorem 2 gives conditions that are
sufficient for this series to converge and at each x to have the value f(x), except at
discontinuities of f(x), where the series equals the arithmetic mean of the left-hand and
right-hand limits of f(x) at that point.
..
=J~
1. (Calculus review) Review integration techniques for
integrals as they are likely to arise from the Euler
formulas, for instance, definite integrals of x cos /lX,
x 2 sin I1X, e- 2x cos I1X, etc.
@-iJ
FUNDAMENTAL PERIOD
Theful1damental period is the smallest positive period. Find
it for
sin 2x,
2. cos x, sinx. cos 2x.
cos 27TX, sin 27TX
3. cos I1X.
27TI1X
sin nx.
cos - k - ,
27TX
cos -k-
cos 7TX,
sin 7TX.
27TX
6. (Change of scale) If f(x) has period 17, show that f(ax),
a *- O. and f(x/b) , b *- O. are periodic functions of x
of periods pia and bp, respectively. Give examples.
17-121
GRAPHS OF 21T"PERIODIC FUNCTIONS
Sketch or graph f(x), of period 27T, which for -7T < X <
is given as follows.
7. f(x) =x
8. f(x) = e- lxl
9. f(x)
11. f(x)
sin - k '
27TI1X
sin - -
12. f(x)
k
4. Show that f = COl1st is periodic with any period but
has no fundamental period.
S. If f(x) and g(x) have period p, show that
hex) = af(x) + bg(x) (a, b, constant) has the period p.
Thus all functions of period 17 form a vector space.
113-241
10. f(x)
Ixl
7T -
3
if
x3
if
{-X
Losl
~x
-7T
7T
Isin 2xI
< x < 0
O<X<7T
if-7T<x<O
if
O<x<
7T
FOURIER SERIES
Showing the details of your work, find the Fourier series
of the given f(x)' which is assumed to have the period 27T.
Sketch or graph the pattial sums up to that including
cos 5x and sin 5x.
486
CHAP. 11
lIn
I
13.
l _
-IT
0
24. f{x)
lIT
0
-IT
15.
16.
IT
1
""
"-
'.
+
~ sin 3x
! sin 5x + ... )
+ i sin 4x + i sin 6x
+
... )
+ ----z (cos x + 9 cos 3x + :l5 cos Sx + ... )
+ 4(cos x + - ... )
(c) ~~
i
cos 2x +
i cos 3x - -h cos 4x
27. CAS EXPERIMENT. Order of Fourier Coefficients.
The order seems to be lin if f is discontinous. and 11112
if f is continuous but f' = dfldx is discontinuous. 1In 3
if f and J' are continuous but fff is discontinuous, etc.
Try to verify this for examples. Try to prove it by
integrating the Euler formulas by parIs. Whal is the
practical significance of this?
Tr
'/
-Tr
7T
7T
Tr
0
18.
0 < x <
1 4 1
(b) 2
Tr
-Tr
if
< x < 0
- 2( i sin 2x
"2Tr
17.
4x
-7T
26. CAS EXPERIMENT. Graphing. Write a program for
graphing partial sums of the following series. Guess
from the graph what f(x) the series may represent.
Confirm or disprove your guess by using the Euler
IT
~
~
2
if
(a) 2{sinx
0
-Tr
-4X
{
formula~.
/~
-Tr
=
25. (Discontinuities) Verify the last statement in Theorem
2 for the discontinuities of f(x) in Prob. 13.
2
14.
Fourier Series, Integrals, and Transforms
28. PROJECT. Euler Formulas in Terms of Jumps
Without Integration. Show that for a function whose
third derivative is identically zero,
IT
0
an
19.
=
1l'iT
I
+
n2
1
-n
n7T
-
L is.ff.Sill nxs ]
Lj~ sinnxs
2
I ~ _ff
]
L..J is cos nxs
11
where n = I, 2, ... and we sum over all the jumps js,
J', J'. respectively. located atxs'
20.
j~,j; of f,
29. Apply the formulas in Project 28 to the function in
Prob. 21 and compare the results.
21. f(x) = x 2
22. f(x)
23. f{x)
( - 7T
= x 2 (0
<
<
X
X
< 27T)
<
7T)
30. CAS EXPERIMENT. Orthogonality. Integrate and
graph the integral of the product cos mx cos nx (with
various integer m and IJ of your choice) from -a to a
as a function of a and conclude orthogonality of cos
mx and cos nx (m *- Il) for a = 7T from the graph. For
what m and n will you get orthogonality for a = 7T/2,
rr/3, 7T14? Other a? Extend the experiment to cos mx
sin Il\: and sin 111 ,. sin l1X.
SEC. 11.2
11.2
Functions of Any Period
p = 2L
487
= 2L
Functions of Any Period p
The functions considered so far had period l7T, for the simplicity of the formulas. Of
course, periodi.c function~ in applications will generally have other periods. However, we
now show that the transition from period p = 27T to a period 2L is quite simple. The
notation p = 2L is practical because L will be the length of a violin string (Sec. 12.2) or
the length of a rod in heat conduction (Sec. 12.5), and so on.
The idea is simply to find and use a cha1lge of scale that gives from a function g(v) of
period 27T a function of period 2L. Now from (5) and (6) in the last section with g(v)
instead of I(x) we have the Fourier series
(1)
g(v)
""
= ao + 2:
(an cos flV
+ bn sin llv)
n=l
with coefficients
1
ao
J
27T
= - J
7T
= - J
7T
7T
= -
g(v) dv
-71"
1
(2)
lin
7T
g(v) cos flV dv
-7T
1
bn
7T
g(v) sin llV dv.
-7T
We can now write the change of scale as v = kx with k such that the old period v = 27T
gives for the new variable x the new period x = 2L. Thus, 27T = k2L. Hence k = 7TIL and
v = kx = 7TXIL.
(3)
This implies dv = (7TIL) dx. which upon substitution into (2) cancels 1I27T and 1I7T and
gives instead the factors 1I2L and IlL. Writing
(4)
g(v)
=
I(x),
we thus obtain from (1) the Fourier series of the function f(x) of period 2L
(5)
f(x)
1l7T
= ao + ~l
00
(
an cos
L
x
+ bn
1l7T
sin
L
x)
with the Fourier coefficients of f(x) given by the Euler formulas
1
(a)
(6)
(b)
(C)
1
an = L
1
bn = L
J
L
ao = -;;...L
f(x) dx
-L
JL f(x) cos -ll7TX- dx
-L
JL I(x) sin -ll7TX- dx
-L
11
=
1,2, ...
n
=
1,2, ...
L
L
488
CHAP. 11
Fourier Series, Integrals, and Transforms
Just as in Sec. 11.1, we continue to call (5) with any coefficients a trigonometric series.
And we can integrate from 0 to 2L or over any other interval of length p = 2L.
E X AMP L E 1
Periodic Rectangular Wave
Find the Fourier series of the function (Fig. 259)
f(x) =
o
if
-2 < x < -1
Ok
if
-1
{
if
Solution.
<
<
1
1<..1.'<
2
x
p
=
2L
= 4.
= 2.
L
From (6a) we obtain ao = kl2 (verify!). From (6h) we ohtain
an
I {/(X)
=
11;'
cos
=
dx
I f/
,~: sin 1127T .
=
cos 11;..1.' dx
Thus an = 0 if 11 is even and
an
= 2kln7T
if n
=
1, 5, 9, ....
= 0 for 11 =
From (6c) we find that b n
k
f(x) = -
2
+ -2k
7T
(
an = -2kll17T if
1,2..... Hence the Fourier series is
7T
1
37T
cos - x - - cos - x
. 2
3
2
I
+ -5
cos
'
~b
-2
-1
57T
+ ... )
-..I. -
2
I
1'-----:!~:---'
0
Fig. 259.
E X AMP L E 2
= 3,7, 11, ....
n
x
Example 1
Periodic Rectangular Wave
Find the Fourier
serie~
of the function (Fig. 260)
if
-2 < x < 0
kif
0<..1.'<2
-k
f(x) = {
Solution. ao
=
L = 2.
p = 2L = 4,
0 from (6a). From (6bt with IlL = 112,
an =
2
[f
a
Il1iX
_2(-k) cos -2- dt:
+
{2
0
n7TX
0
2 [
_
2k sin 117TX 1
n7T
2 -2
+
]
k cos -2- dx
2
2k sin 1l7TX 1
1l7T
2 0
J
= O.
so that the Fourier series has no cosine terms. From (6c).
0
bn
2
[2k
fl7TX
1l7TX
cos - 1 - -2k cos -1
2
1l7T
2 -2
1l7T
2 0
=
-I
=
117T
k
(I - cos n7T - cos
1l7T
+ 1)
=
J
{4klll7T
if
0
if
Il
II
=
L 3•...
= 2, 4, ....
.
•
SEC lU
Functions of Any Period p
=lL
489
Hence the Fourier series of f(.1:) is
~
f(x)
4k (Sin ~ x + ~ sin 37T x + ~ sin 57T x + ... )
7T
2
3
2
5
2
.
It is interesting that we could have derived this from (8) in Sec. 11.1, namely, by the scale change (3). Indeed.
writing v instead of x, we have in (8), Sec. 11.1,
:
+
~ sin 3v +
(sin v +
sin 5v + ... ) .
Since the period 27T in v corresponds to 2L = 4, we have k = 7TIL = 7T12 and v = kx = TTXI2 in (3); hence we
obtain the Fourier series of f(x), as before.
•
{(x)
k
x
2
L-
'-------j-k
~,
-rr/m
Example 2
Fig. 260.
EXAMPLE 3
""'6
o
rr/m
Half-wave rectifier
Fig. 261.
Half-Wave Rectifier
A sinusoidal voltage E sin WT. where T is time. is passed through a half-wave rectifier that clips the negative
portion of the wave (Fig. 261). Find the Fourier series of the resulting periodic function
if
E sin
Solution.
-L
0
u(t) = {
< t < O.
p = 2L
wi
~
L=
W
0< r < L
if
7T
W
Since u = 0 when -L < t < 0, we obtain from (6a), with t instead of x,
f7C/W
W
ao
= -
27T
0
E
E sin wt dt = 7T
and from (6b), by using formula (11) in App. A3.1 with x = wt and y = I1wt,
an
=
!:'!... f7C/W E sin WT cos I1WT dt
7T
~E
_7T
=
0
If 11 = I, the integral on the right is zero, and if 11
an =
=
f",IW[Sin (1 +
+ sin (1 -
I1)M]
(~
:
~7T +
If 11 is odd, thi, is equal to zero, and for even
11
we have
cos <I - I1)Wt ] 7C/w
(I - l1)w
0
+ _-_c_o_s_<_I_-_")_7T_+_1 ) .
1 - 11
2E
(11 -
In a similar fashion we find from (tiC) that b i = E12 and bn
u(t) =
~
+
f
sin wt -
z:
dt.
2, 3, ... , we readily obtain
wE [_ cos (1 + Il)wt
27T
(1 + I1)W
2: (-cos
111M
0
~
=
0 for
1)(11
11 =
~
+ 1)7T
(Il = 2,4, .. ').
2,3, .... Consequently,
(1 3 cos 2mt + 3 5 cos 4wt + .. -) .
•
CHAP. 11
490
11-111
Fourier Series, Integrals, and Transforms
FOURIER SERIES FOR PERIOD P = lL
Fmd the Fourier series of the function f(x), of pedol! p = 2L,
and sketch or graph the first three partial sums. (Show the
details of your work.)
1. f(x) = -1 (-2 < x < 0). f(x) = 1 (0 < x < 2). p = 4
2. f(x) = 0 (- 2 < x < 0). f(x) = 4 (0 < x < 2). p = 4
3. f(x) = x 2 ( - I < x < 1), p = 2
4. f(x)
7Tx 312 (-I < x < I), p = 2
5. f(x)
sin TTX (0 < X < I), P = 1
6. f(x)
cos TTX
< x < ~), p = 1
7.f(x)
Ixl (-l<x<l). p=2
I + x if - I < x < 0
8. f(x) = { 1 _ x if 0 < x < 1. p = 2
(-4
9. f(x) = I - ~2 (-1 < x < I), P = 2
10. f(x) = 0 (-2 < x < 0), f(x) = x (0 < x < 2), p = 4
ll.f(x)=-x (-I<x<O), f(x)=x (O<x<I).
f(x) = 1 tl < x < 3), p = 4
12. (Rectifier) Find the Fourier series of the function
obtained by passing the voltage v(t) = Vo cos 100m
through a half-wave rectifier.
13. Show that the familiar identities
cos3 x =! cos x + ~ cos 3x and
sin 3x can be interpreted as
sin3 x = ~ sin x Fourier series expansions. Develop cos4 x.
!
11.3
14. Obtain the series in Prob. 7 from that in Prob. 8.
15. Obtain the series in Prob. 6 from that in Prob. 5.
16. Obtain the series in Prob. 3 from that in Prob. 21 of
Problem Set 11.1.
17. Using Prob. 3, show that
I -
!+~
18. Show that I
-
k
+ - . . . =
fz7T
+! + ~ + k + ...
2
= ~7T2.
19. CAS PROJECT. Fourier Series of 2L-Periodic
Functions. (a) Write a program for obtaining partial
sums of a Fourier series (1).
(b) Apply the program to Probs. 2-5. graphing the first
few partial sums of each of the four series on common
axes. Choose the first five or more partial sums until
they approximate the given function reasonably well.
Compare and comment.
20. CAS EXPERIMENT. Gibbs Phenomenon. The
partial sums ,1'n(X) of a Fourier series show oscillations
near a discontinuity point. These oscillations do not
disappear as 1l increases but instead become sharp
"spikes." They were explained mathematically by
1. W. Gibbs3 • Grdph sn(x) in Prob. 10. When 11 = 50.
"ay. you will see those oscillations quite distinctly.
Consider other Fourier series of your choice in a similar
way. Compare.
Even and Odd Functions.
Half-Range Expansions
The function in Example 1, Sec. 11.2, is even, and its Fourier series has only cosine
terms. The function in Example 2, Sec. 11.2, is odd, and its Fourier series has only sine
terms.
Recall that g is even if g( - x) = g(x), so that its graph is symmetric with respect to the
vertical axis (Fig. 262). A function h is odd if h( - x) = - hex) (Fig. 263).
Now the cosine terms in the Fourier series (5), Sec. I L.2. are even and the sine terms
are odd. So it should not be a surprise that an even function is given by a series of
cosine terms and an odd function by a series of sine terms. Indeed, the following holds.
3 JOSIAH WILLARD GIBBS (1839-1903). American mathematician. professor of mathematical physics at
Yale from 1871 on. one of the founders of vector calculus [another being O. Heaviside (see Sec. 6.1)],
mathematical thermodynamics. and statistical mechanics. His work was of great importance to the development
of mathematical physics.
SEC. 11.3
491
Even and Odd Functions. Half-Range Expansions
y
y
x
Fig. 262.
THEOREM 1
Even function
Fig. 263.
Odd function
Fourier Cosine Series, Fourier Sine Series
The Fourier series of an even function of period 2L is a "Fourier cosine series"
ro
(1)
f(x) =
£/0
+ 2:
117T
an cos
L
(f even)
X
n=l
with coefficients (note: integration from 0 to L only!)
1
(2)
ao
=
-
L
J
L
f(x) dx,
0
2 JL
an = L
0
tl7TX
f(x) cos - - dx,
L
n
=
1,2, .. '.
The Fourier series of an odd function of period 2L is a "Fourier sine series"
ro
(3)
=
f(x)
2:
bn sin
n7T
L
x
(f odd)
n=l
with coefficients
(4)
PROOF
2 JL
n7TX
f(x) sin - - (h.
L 0
L
bn = -
Since the definite integral of a function gives the area under the curve of the function
between the limits of integration. we have
L
J
J
L
g(x) d>::
=2
-L
J
g(x) dx
for even g
0
L
hex) dx = 0
for odd h
-L
as is obvious from the graphs of g and h. (Give a formal proof.) Now let f be even. Then
(6a), Sec. 11.2, gives ao in (2). Also, the integrand in (6b), Sec. 11.2, is even (a product
of even functions is even), so that (6b) gives an in (2). Furthermore, the integrand in (6c),
Sec. 11.2, is the even f times the odd sine, so that the integrand (the product) is odd, the
integral is zero, and there are no sine terms in (1).
492
CHAP.11
Fourier Series, Integrals, and Transforms
Similarly, if f is odd. the integrals for ao and an in (6a) and (6b). Sec. 11.2. are zero,
f times the sine in (6c) is even. (6c) implies (4), and there are no cosine terms in (3). •
oc
If L =
The Case of Period 27T.
coefficients
I
(2*)
ao
=-
7f
7f,
+
then f(x) = ao
~ an cos nx (f even) with
n~l
2
f f(x) dx,
'iT
an
= -
0
7f
f f(x) cos nx dx,
'iT
n = 1,2, ...
0
co
and f(x) = ~ bn sin nx (f odd) with coefficients
n=l
2
(4*)
bn
=
7f
f
'iT
f(x) sin nx dx,
n
= 1,2,···.
0
For instance, f(x) in Example I, Sec. ILl, is odd and is represented by a Fourier sine
series.
Further simplifications result from the following property, whose very simple proof is
left to the student.
THEOREM 2
Sum and Scalar Multiple
The Fourier coefficients of a sum h + f2 are the sums of the corresponding Fourier
coefficients of f 1 and f 2·
The Fourier coefficients of cf are c times the corresponding Fourier coefficiencs
off·
E X AMP L E 1
Rectangular Pulse
The function f"(x) in Fig. 264 is the sum of the function f(x) in Example I of Sec 11.1 and the constant k.
Hence. from that example and Theorem 2 we conclude that
f*(x) = k
E X AMP L E 2
4k
+ -:;
( sin x
+ "31
sin 3x
+
5"1
sin 5x
•
+ . .. ) .
Half-Wave Rectifier
The function u(t) in Example 3 of Sec. 11.2 has a Fourier cosine series plus a single term vCr) = (E/2) sin wi.
We conclude from this and Theorem 2 that U(l) - Vel) must be an even function. Verify this graphically. (See
Fig. 265.)
•
y
[*(x)
2k
-1r
o
Fig. 264.
Example 1
Fig. 265.
u(t) - v(t) with E = 1,
W
= 1
SEC. 11.3
493
Even and Odd Functions. Half-Range Expansions
EXAM PLE 3
Sawtooth Wave
Find the Fourier series of the function (Fig. 266)
f(x) = x
+
7T if
-7T < x
(a)
(b)
<
7T
l(x
and
+ 27T)
=
f(x).
The functionf(x)
Partial sums 81> 8 2 , 8 3, 8 20
Example 3
Fig. 266.
Solution. We have f = iI + f2' where h = x and f2 = 7T. The Fourier coefficients at f2 are zero, except
for the first one (the constant term). which is 7T. Hence, by Theorem 2. the Fourier coefficients an' bn are those
of iI, except for ao, which is 7T. Since iI is odd, an = 0 for n = 1,2, ... , and
bn
=
2.7T J('iIlX)sin ny dx = 2.7T J("x sinllx cL--.:.
o
o
Integrating by parts, we obtain
b
n
2
7T
=-
[
--XCOSIlX
11
Hence b i = 2, b 2 = - 2/2, bs = 2/3, b4
f(x) = 7T
+
I'" + - f'"cosl1xcL>:]
1
0
=
11
= -
0
2 COSl17T.
-
11
-214, ... , and the Fourier series of f(x) is
2 (Sin x -
~
sin 2x
+
~
sin 3x -
+ ... ) .
•
Half-Range Expansions
Half-range expansions are Fourier series. The idea is simple and useful. Figure 267
explains it. We want to represent f(x) in Fig. 267a by a Fourier series. where f(x) may
be the shape of a distorted violin string or the temperature in a metal bar of length L, for
example. (Corresponding problems will be discussed in Chap. 12.) Now comes the idea.
494
CHAP. 11
Fourier Series, Integrals, and Transforms
[(x)~
L
x
(a) The given function [(x)
-L
(b) [(x)
x
L
extended as an even periodic function of period 2L
(e) [(x) extended as an odd periodic function of period 2L
Fig. 267.
(a) Function fIx) given on an interval 0
~
x
~
L
(b) Even extension to the full "range" (interval) -L ~ x ~ L (heavy curve)
and the periodic extension of period 2L to the x-axis
(c) Odd extension to -L
~ x ~ L (heavy curve) and the periodic extension
of period 2L to the x-axis
We could extend I(x) as a function of period L and develop the extended function into a
Fourier series. But this series would in general contain both cosine and sine terms. We
can do better and get simpler series. Indeed, for our given I we can calculate Fourier
coefficients from (2) or from (4) in Theorem l. And we have a choice and can take what
seems more practicaL If we use (2). we get (1). This is the even periodic extension II
of f in Fig. 267b. If we choose (4) instead. we get (3), the odd periodic extension I2 of
I in Fig. 267c.
Both extensions have period 2L. This motivates the name half-range expansions: I is
given (and of physical interest) only on half the range, half the interval of periodicity of
length 2L.
Let us illustrate these ideas with an example that we shall also need in Chap. 12.
E X AMP L E 4
"Triangle" and Its Half-Range Expansions
Find the two half-range expansions of the function (Fig. 268)
L
2k
L
x
o
Ll2
Fig. 268. The given
function in Example 4
T
f(x) =
X
{ 2k
T(L -
Solution. (a) E,'en periodic extell.~ion.
an
2
=L
[2k
X}
if
O<x<2"
if
2"<x<L.
L
From (2) we obtain
rUzxcosT xdx + T2k fLuz(L-x)cosT
117T xdx] .
T J
o
1l7T
SEC. 11.3
495
Even and Odd Functions. Half-Range Expansions
We cunsider an' For the first integral we obtain by integration by parts
r
J
0
~ sin
l2
X
cos nTT x (iT =
L
nTT
2
I
L fL/ S1I1 nTT x dx
nTT 0
L
nTT x L/2
L
0
L2 sin nTT
211TT
2
+
1) .
2L22 (cos 112TT _
n TT
Similarly, for the second integral we obtain
f
L
nTT
x dx = -
(L - x) cos -
L/2
L
L
11TT
(L - x) sin -
nTT
IL
x
L
+
L/2
L
11TT
f
L
nTT
sin -xdx
L
L/2
We insert these two results into the formula for an' The sine terms cancel and so does a factor L2. This gives
(2 cos 112TT - cos n TT -
4k
n 2 TT2
1) .
Thus,
and a" = 0 if n
* 2.6. 10. 14..... Hence the first half-range expansion of
f(x) =
~
2
-
~ (~
22
TT2
cos 2TT X
L
+
~2
cos 6TT X
L
6
f(x) is (Fig. 269a)
+ ... ) .
This Fourier cosine series represents the even periodic extension of the given function f(x), of period 2L.
(b) Odd periodic exte11sio11. Sunilarly, from (4) we obtain
8k
11TT
bn = 2 2 sin - .
n 1T
2
(5)
Hence the other half-range expansion of f(x) is (Fig. 269b)
8k
f(X) = TT2
(
1
TT
1
12 sin LX - 32
3TT
sin LX
+
52
5TT
sin L X -
This series represents the odd periodic extension of f(x), of period 2L.
Basic applications of these results will be shown in Sees. 12.3 and 12.5.
-L
o
•
x
L
(a) Even extension
x
(b) Odd extension
Fig. 269.
Periodic extensions of [(xl in Example 4
CHAP. 11
496
Fourier Series, Integrals, and Transforms
S£E?H:r-~--
[I~
EVEN AND ODD FUNCTIONS
12. fIx) =
Are the following functions even. odd. or neither even nor
odd?
1. lxi, x 2 sin IIX, x + x 2 • e- 1xl , In x, x cosh x
2. sin (X2), sin 2 x, x sinh x, Ix3 1, e=-, xe x , tan 2x, xlO + x2)
Are the following functions, which are assumed to be
periodic of period 27T. even. odd, or neither even nor odd?
3. lex) = x 3 ( - 7T < X < 7T)
4. lex) = x 2 (-7T/2 < x < 37T12)
5. f(x) = e- 4x (-7T < x < 7T)
lex) = x 3 sin x (-7T < X < 17)
7. lex) = xlxl - x 3 (-17 < X < 7T)
8. lex) = I - x + x 3 - x 5 (-7T <
9. f(x) = 1/(1 + \"2) if -17 < x < o. f(x)
X
< 7T)
=
-1/(1
+ x2)
ifO<x<7T
10. PROJECT. Even and Odd Functions. (a) Are the
following expressions even or odd? Sums and products
of even functions and of odd functions. Products of
even times odd functions. Absolute values of odd
functions. f(x) + f( -xl and f(x) - f( -x) for arbitrary
f(x).
(b) Write ekx , lI(l - x). sin (x + k), cosh (x + k) as
sums of an even and an odd function.
(c) Find all functions that are both even and odd.
(d) Is cos3 \" even or odd? sin 3 x? Find the Fourier
series of these functions. Do you recognize familiar
identities?
111-161
FOURIER SERIES OF EVEN AND ODD
FUNCTIONS
11.4
13. f(x) =
{
17-X
< 1712
7T12 < x < 31712
if
if-17<x<O
14.
lex)
if 0
e
if
15. f(x)
-2
if 0
lex)
117-251
=c
- !Ixl
<X<17
<x<O
<x<2
if
-2
< x <
2
(p
0
2<x<6
if
=
8)
HALF-RANGE EXPANSIONS
Find (a) the Fourier cosine series, (b) the Fourier sine serie~.
Sketch J(x) and its two periodic extensions. (Show the
details of your work.)
17. f(x) = I
(0 < x < 2)
18. f(x) = x
(0
19.
lex)
= 2 -
20.
lex)
=
21. f(x)
< x < ~)
(0 < x < 2)
(0 < x < 2)
x
o
{1
(2
1
-- {2
(0
(I
22. f(x) = { x
7T/2
23. f(x) = x
< x <
< x <
4)
I)
< x < 2)
(0 < x < 7T12)
(1712 < x < 7T)
< x < L)
(0 < \. < L)
X
(0 < x < 17)
(0
24. f(x) = x 2
Is the given function even or odd? Find its Fourier series.
Sketch or graph the function and some partial sums. (Show
the details of your work.)
11. lex) = 17 - Ixl (-7T < x < 7T)
(-I<x<l)
if
-1712 < x
X
16.
6.
2xlxl
25. f(x) = 7T -
26. Illustrate the formulas in the proof of Theorem I with
examples. Prove the formulas.
Complex Fourier Series.
Optional
In this optional section we show that the Fourier series
(1)
f(x)
=
ao
+~
(an cos
IIX
+ bn
sin I1X)
n~l
can be written in complex form, which sometimes simplifies calculations (see Example 1,
on page 498). This complex form can be obtained because in complex, the exponential
function e it and cos t and sin t are related by the basic Euler formula (see (11) in Sec. 2.2)
(2)
eit = cos t
+
i sin T.
Thus
e- it = cos t - i sin t.
SEC. 11.4
Complex Fourier Series.
497
Optional
Conversely, by adding and subtracting these two fonnulas, we obtain
(b)
(3)
1.
.
sin t = _(e't - e- lt ).
2i
From (3), using 1Ii = -i in sin t and setting t = nx in both formulas, we get
1
.
1nx
-2 (an - ib.,o}e
.
We insert this into (1). Writing ao =
we get from (l)
+ "2I
!(an - ibn)
Co'
(a n
.
-'nx
lbn )e • .
+
= Cn' and !(an + ibn) = kn ,
00
(4)
f(x)
=
+
Co
~ (cne inx + kne- inx ).
n=1
The coefficients Cl' C2, • • • • and klo k2 •
then (2) above with t = nx.
1
C
n
= -2 (an - ibn) = -
1
27T
f
•.•
are obtained from (6b), (6c) in Sec. 11.1 and
7T
f(x)(cos nx - i sin llX) dx
_"
= -
1
27T
f
7T
•
f(x)e- mx dx
-TT
(5)
1
kn = - (an
2
+
ibn)
f
27T
= -
1
f(x)(cos
llX
+
f
27T
1
7T
i sin 1LX) dx = -
-7T
7T
_
f(x)e 1nx dx.
-7T
Finally, we can combine (5) into a single formula by the trick of writing kn =
(4). (5), and Co = ao in (6a) of Sec. ll.l give (summation from -cx::!)
en'
Then
00
cnein.r ,
f(x) = ~
n=-co
(6)
= -
C
11.
1
f
7T
f(x)e- tnx dx,
11
27T_-rr
= O. ±1, ±2, .. '.
This is the so-called complex fOl"l/l of the Fourier series or, more briefly, the complex
Fourier series, of f(x). The Cn are called the complex Fourier coefficients of f(x).
For a function of period 2L our reasoning gives the complex Fourier series
00
f(x)
(7)
=
~
Cnein7rxlL,
n=-x
11
= 0,
±l, ±2,···.
498
E X AMP L E 1
CHAP. 11
Fourier Series, Integrals, and Transforms
Complex Fourier Series
Find the complex Fourier series of fex) = eX if -7T < x < 7T and f(x
Fourier series.
+ 27T)
= f(x)
and obtain from it the usual
Solution. Since sin n7T = 0 for integer n, we have
e"'in7T = cos n7T ::':: i sin n7T = cos n7T = (-I)n.
With this we obtain from (6) by integration
en =
27T
ITT'
eXe- inx dx =
eX-inxl""
__
1_
X~-7T
27T 1 - in
-7T
On the right,
I
I - in
+ in
(I - in)(I
+
I
in)
I
+ in
+ n2
e 7T - e- 7T = 2 sinh 7T.
and
Hence the complex Fourier ,erie, is
sinh 7T
(8)
7T
00
L
I + in .
(_l)n - - - 2 - e1nx
n~-oo
I
+
From this let us derive the real Fourier series. Using (2) with t
(I + il1)i
nx
= (I + ill)(cos IU: +
Now (8) also has a corresponding term with
sin (-In) = -sin IIX, we obtain in this term
i sin In)
-II
( - 7T <
= l1X and i 2 =
= (cos nx -
X
< 7T).
n
-1, we have in (8)
n sin l1x) + ;(n cos nx
+ sin l1x).
instead of n. Since cos (-nx) = cos IU: and
(I - in)e- inx = (I - ;n)(cos nx - i sin nx) = (cos l1X - n sin IU:) -
i(n cos nx
+ sin nx).
If we add these two expressions, the imaginary parts cancel. Hence their sum is
2(cos nx - n sin nx),
n = 1,2,···.
For II = 0 we get I (not 2) because there is only one term. Hence the real Fourier series is
(9)
2sinh7T[1
I
- - - - - (cos x - sin x)
7T
2
I + 12
eX = - - -
I
+ -- (cos il - 2 sin il) - + ..
1 + 22
J
In Fig. 270 the poor approximation near the jumps at ::'::7T is a case of the Gibbs phenomenon (see CAS
Experiment 20 in Problem Set 11.2).
•
y
25
~
20
15
10·
/
5/
-lC,
Fig. 270.
o
lC
Partial sum of (9), terms from n
=
X
0 to 50
SEC. 11.5
499
Forced Oscillations
1. (Calculus review) Review complex numbers.
2. (Even and odd functions) Show that the complex
Fourier coefficients of an even function are real and
those of an odd function are pure imaginary.
3. (Fourier coefficients) Show that
ao = Co, an = C n + C- n , b n = i(c n - c- n )·
4. Verify the calculations in Example 1.
S. Find further temlS in (9) and graph partial sums with
your CAS.
6. Obtain the real series in Example 1 directly from the
Euler formulas in Sec. II.
[7-131
10. Convert the series in Prob. 9 to real form.
X
< 7r)
12. Convert the series in Prob. II to real form.
13. f(x)
= x
(0
< x < 27r)
14. PROJECT. Complex Fourier Coefficients. It is very
interesting that the Cn in (6) can be derived directly by
a method sinlllar to that for an and b n in Sec. 11.1. For
this, mUltiply the series in (6) by e- imx with fixed
integer m, and integrate term wise from -7r to 7r on
both sides (allowed, for instance, in the case of uniform
convergence) to get
COMPLEX FOURIER SERIES
I7T f(x)e- imx dx = ~
Find the complex Fourier series of the following functions.
(Show the details of your work.)
7. f(x) = -1 if - 7r < X < 0, f(x) = 1 if 0 < x < 7r
8. Convert the series in Prob. 7 to real form.
9. f(x) = x (-7r < X < 7r)
11.5
(-7r <
11. f(x) = x 2
cn
I7T ei(n-m)x dx.
n=-OO-71"
-7r
Show that the integral on the right equals 27r when
n = m and 0 when n =1= m [use (3b)], so that you get
the coefficient formula in (6).
Forced Oscillations
Fourier series have important applications in connection with ODEs and PDEs. We show
this for a basic problem modeled by an ODE. Various applications to PDEs will follow
in Chap. 12. This will show the enormous usefulness of Euler's and Fourier's ingenious
idea of splitting up periodic functions into the simplest ones possible.
From Sec. 2.8 we know that forced oscillations of a body of mass m on a spring of
modulus k are governed by the ODE
(1)
my"
+ cy' + f....), =
ret)
where y = yet) is the displacement from rest, c the damping constant, k the spring constant
(spring modulus), and r(t) the external force depending on time t. Figure 271 shows the
model and Fig. 272 its electrical analog, an RLC-circuit governed by
c
R
L
E(t)
Fig.271.
Vibrating system under
consideration
Fig. 272. Electrical analog of the
system in Fig. 271 (RLC-circuit)
500
CHAP.11
Fourier Series, Integrals, and Transforms
Ll"
(1*)
I
+ RT' + -
C
T = E' (t)
(Sec. 2.9).
We consider (1). If ret) is a sine or cosine function and if there is damping (c > 0),
then the steady-state solution is a harmonic oscillation with frequency equal to that of r(t).
However, if r(t) is not a pure sine or cosine function but is any other periodic function,
then the steady-state solution will be a superposition of harmonic oscillations with
frequencies equal to that of r(t) and integer multiples of the latter. And if one of these
frequencies is close to the (practical) resonant frequency of the vibrating system (see
Sec. 2.8), then the corresponding oscillation may be the dominant patt of the response of
the system to the external force. This is what the use of Fourier series will show us. Of
course, this is quite surprising to an observer unfamiliar with Fourier series, which are
highly important in the study of vibrating systems and resonance. Let us discuss the entire
situation in terms of a typical example.
E X AMP L E 1
Forced Oscillations under a Nonsinusoidal Periodic Driving Force
In (I), let
In
= 1 (gm), C = 0.05
(gmfsec), and k
y"
(2)
where r(t) is measured in gm • cmfsec
t
2
.
= 25 (gmfsec 2 ), so that (1)
+ 0.05/ +
25y = r(t)
Let (Fig. 273)
+~
if
-7T<t<O,
if
O<t<7T,
2
r(t) =
{
-( +
becomes
r(t
IT
2"
+
27T) = r(t).
Find the steady-state solution yet).
Fig. 273.
Solution.
We represent ret) by a Fourier series. finding
(3)
r(t) =
~
(cos t
7T
(take the answer
(4)
Force in Example 1
[0
+
~
3
cos 3t
+
~
52
cos 5t + ... )
Prob. 11 in Problem Set 11.3 minus ~7T and write
.v
"
+ 0.05y. I + 251'-
= -24- cos Ilt
11
t
(n
for x). Then we consider the ODE
= 1. 3.... )
7T
whose right side is a single term of the series (3). From Sec. 2.8 we know that the steady-state solution vn(t)
of (4) is of the form
(5)
Yn = An cos I1f
+ Bn sin nt.
SEC. 11.5
SOl
Forced Oscillations
By substituting this into (4) we find that
(6)
0.2
An =
Since the ODE (2)
where
i~
linear. we may expect the steady-state solution to be
(7)
Y = .1'1
+ )'3 + Y5 + ...
where )'n is given by (5) and (6). In fact, this follows readily by substituting (7) into (2) and using the Fourier
series of r( t), provided that termwise differentiation of (7) is permissible. (Readers already fami) iar with the notion
of uniform convergence [Sec. 15.51 may prove that (7) may be diilerentiated term by term.)
From (6) we find that the amplitude of (5) is (a factor Vii;. cancels out)
Numeric values are
C 1 = 0.0531
C3
=
0.0088
C5 = 0.2037
C7
=
0.0011
C9 = 0.0003.
Figure 274 shows the input (multiplied by 0.1) and the output. For n = 5 the quantity Dn is very small. the
denominator of C5 is small, and C5 is so large that Y5 is the dominating term in (7). Hence the output is almost
a harmonic oscillation of five times the frequency of the driving force, a little distorted due to the term Yl, whose
amplitude is about 25% of that of Y5' You could make the situation still more extreme by decreasing the damping
•
constant c. Try it.
y
0.3
Fig. 274.
Input and steady-state output in Example 1
1. (Coefficients) Derive the fonnula for en from An and Bn.
2. (Spring constant) What would happen to the amplitudes
en in Example 1 (and thus to the fonn of the vibration)
if we changed the spring constant to the value 97 If we
took a stiffer spring with k = 817 First guess.
3. (Damping) In Example I change c to 0.02 and discuss
how this changes the output.
4. (Input) What would happen in Example I if we
replaced ret) with its derivative (the rectangular wave)?
What is the ratio of the new en to the old ones?
502
CHAP.11
15-111
Fourier Series, Integrals, and Transforms
114-171
GENERAL SOLUTION
w 2y = ret)
Find a general solution of the ODE y" +
r(t) as given. (Show the details of your work.)
Find the steady-state oscillation of y" + c/ + Y = r(t)
with c > 0 and ret) as given. (Sho\'i the details of your
work.)
14. ret) = an cos III
15. r(t) = sin 3t
if -7T12 < t < ,,12
7Tt
with
5. r(t) = cos wt, w = 0.5, 0.8, U, 1.5, 5.0, 10.0
6. r(t) = cos WIt
+ cos
w2t
(w
2
2
=1= W1 , W22)
N
7. r(t) =
2
an cos
Ilt,
Iwl
1, 2, ... , N
=1=
16. reT)
n=1
8. r(t)
sin t +
l
sin 3t +
t+7f if
9. r(t)
{
and
r(t
=
-t
+
+
{
r(t
+
and
17. ret) =
O<t<7T
27T) = ret),
Iwl
=1=
0, 1,3,
- 7T12 < t <
7f( 7T - t)
+
r(t
if
7T12
<
t
<
3,,/2
27f) = ret)
if
+
7T
"4
7T12
Isin
t/
< t < 37T12
Iwl
27T) = reT),
=1=
1,3,5, . . .
if -7T < t < 7T and
27T) = ret).
Iwl
=1=
o.
2
b n sin nt
18. CAS EXPERIMENT. Maximum Output
Graph and discus~ outputs of y" + cy' + /...y
with r(t) as in Example I for various c and
emphasis on the maximum Cn and its ratio
second largest Icni.
7T12
Term.
= ret)
k with
to the
~9_-~
RLC-CIRCUIT
Find the steady-state current I(t) in the RLC-circuit in
Fig. 272, where R = 100 n, L = 10 H, C = 10- 2 F and
E(t) V as follows and periodic with period 27f. Sketch or
graph the first four partial sums. Note that the coefficients
of the solution decrease rapidly.
19. E(t) = 200t( 7T2 - t 2 ) (- 7T < t < 7f)
2. 4 . . . .
12. (CAS Program) Write a program for solving the ODE
just considered and for jointly graphing input and
output of an initial value problem involving that ODE.
Apply the program to Probs. 5 and 9 with initial values
of your choice.
13. (Sign of coefficients) Some An in Example 1 are positive
and some negative. Is this physically understandable?
11.6
{
n=l
(
r(t
11. ret) =
t sin 7t
-7T<t<O
7f if
7T-
and
+
N
if
10. r(t)
! sin 5t
STEADY-STATE DAMPED OSCILLATIONS
100 (7Tt
20. E(t) = {
+
lOO( 7Tt -
t 2)
if
- 7f < t < 0
(2)
if
0 < t < 7T
Approximation by Trigonometric Polynomials
Fourier series playa prominent role in differential equations. Another field in which they
have major applications is approximation theory, which concerns the approximation of
functions by other (usually simpler) functions. In connection with Fourier series the idea
is as follows.
Let lex) be a function on the interval -7T" ~ X ~ 7f that can be represented on this
interval by a Fourier series. Then the Nth partial sum of the series
N
(1)
f(x)
= 00 +
2:
(on cos nx
+ bn sin nx)
n=l
is an approximation of the given f(x). It is natural to ask whether (l) is the "best""
approximation of f by a trigonometric polynomial of degree N, that is, by a function
of the form
N
(2)
F(x)
=
Ao
+
2:
(An cos nx
+
Bn sin nx)
(N fixed)
n=l
where "best" means that the "error" of the approximation is as small as possible.
SEC. 11.6
503
Approximation by Trigonometric Polynomials
Of course, we must first define what we mean by the error E of such an approximation.
We could choose the maximum of If - Fl. But in connection with Fourier series it is
better to choose a definition that measures the goodness of agreement between f and
F on the whole interval - 7T ~ X ~ 7T. This seems preferable, in particular if f has jumps:
F in Fig. 275 is a good overall approximation of f, but the maximum of If - FI (more
precisely, the supremum) is large (it equals at least half the jump of fat Xo). We choose
E =
(3)
J'" (f -
Fi dx .
-'"
This is called the square error of F relative to the function f on the interval -7T ~ X ~ 7T.
Clearly, E ~ O.
N being fixed. we want to determine the coefficients in (2) such that E is minimum.
Since (f - Ff = f2 - 2fF + F2, we have
E =
(4)
J'" f2 dx - 2 J'" fF dx + J'" F2 dx.
-~
-~
-~
We square (2), insert it into the I&<;t integral in (4), and evaluate the occurring integrals.
This gives integrals of cos 2 m: and sin2 1u (n ~ 1), which equal 7T, and integrals of
cos nx, sin 1Z:r. and (cos nx)(sin mx). which are zero (just as in Sec. 11.1). Thus
N(An cos
L7T'" F2 dx = L7T'" [Ao + ~I
llX
+ Bn sin nx)
J2 dx
We now insert (2) into the integral of fF in (4). This gives integrals of f cos nx as well
as f sin IU, just as in Euler's formulas, Sec. 1l.1, for an and bn (each multiplied by An
or Bn)' Hence
J'" fF dx
=
7T( 2A oao
+ AlaI + ...
+ ANaN
+ Bib i + ... +
-'"
With these expressions, (4) becomes
E =
J:J
2
[2A oao +
d, -
27T
7T [
2A02
(5)
+
+
~l (Anan + Bnbn) J
~I (An2 + B n2)J .
x
Fig. 275.
Error of approximation
BNbN)·
504
CHAP. 11
Fourier Series, Integrals, and Transforms
We now take An = an and Bn = b n in (2). Then in (5) the second line cancels half of the
integral-free expression in the first line. Hence for this choice of the coefficients of F the
square error, call it E*, is
(6)
We finally subtract (6) from (5). Then the integrals drop out and we get terms
An2 - 2Anan + an 2 = (An - a n )2 and similar terms (Bn - bn )2:
Since the sum of squares of real numbers on the right cannot be negative,
E - E*
~
0,
thus
E~E*,
and E = E* if and only if Ao = ao, ... , EN = bN . This proves the following fundamental
minimum property of the partial sums of Fourier series.
Minimum Square Error
THEOREM 1
The square error of Fill (2) (with fixed N) relative to f on the interval -7T ~ X ~ 7T
is millimum if alld ollly if the coefficients of Fill (2) are the Foltrier coefficients of
f. This millimllm vallle E* is givell by (6).
From (6) we see that E* cannot increase as N increases, but may decrease. Hence with
increasing N the partial sums of the Fourier series of f yield better and better
approximations to f, considered from the viewpoint of the square error.
Since E* ~ 0 and (6) holds for every N, we obtain from (6) the important Bessel's
inequality
(7)
for the Fourier coefficients of any function
f for which integral on the right exists. (For
F. W. Bessel see Sec. 5.5.)
It can be shown (see [eI2] in App. 1) that for such a function f, Parseval's theorem
holds; that is, formula (7) holds with the equality sign, so that it becomes Parseval's
identity4
(8)
4 MARC ANTOINE PARSEV AL (1755-1836), French mathematician. A physical interpretation of the identity
follows in the next section.
SEC. 11.6
505
Approximation by Trigonometric Polynomials
E X AMP L E 1
Minimum Square Error for the Sawtooth Wave
Compute the minimum square error E* of F(x) with N = 1, 2, ... , 10, 20, ... , 100 and 1000 relative to
f(x) = x
on the interval -
+
71'
71' ~ X ~ 71'.
1
Solution.
+
2 (sin x Sec. 11.3. From this and (6),
F(x) =
71'
'2
1
sin 2,
+
'3
(_l)N+l
sin 3x -
+ ... + -
~ sin Nx) by Example 3 in
Numeric values are:
N
E*
N
E*
N
E*
N
E*
1
2
3
4
5
8.1045
4.9629
3.5666
2.7812
2.2786
6
7
8
9
10
1.9295
1.6730
1.4767
l.3216
L.1959
20
30
40
50
0.6129
0,4120
70
80
90
100
1000
0.1782
0.1561
0.1389
0.1250
0.0126
0.3103
0.2488
0.2077
(iO
F = S1. S2, S3 are shown in Fig. 266 in Sec. 11.3, and F = S20 is shown in Fig. 276. Although l.r(x) - F(x)1
o
-IT
N
IT
X
Fig. 276. F with
= 20 in Example 1
is large at :+: 71' (how large?), where f is discontinuous, F approximates f quite well on the whole interval, except
near :+:71', where "waves" remain owing to the Gibbs phenomenon (see CAS Experiment 20 in Problem Set
11.2).
Can you think of functions f for which E* decreases more quickly with increasing N?
•
This is the end of our discussion of Fourier series, which has emphasized the practical
aspects of these series, as needed in applications. In the last three sections of this chapter
we show how ideas and techniques in Fourier series can be extended to nonperiodic
functions.
L:il
MINIMUM SQUARE ERROR
Find the trigonometric polynomial F(x) of the form (2) for
which the square error with respect to the given f(x) on the
interval - 7T ~ x ~ 7T is minimum, and compute the
minimum value for N = 1, 2.... , 5 (or also for larger
values if you have a CAS).
1. f(x) = x (-7T < X < 7T)
2. f(x) = x 2 (-7T < X < 7T)
3. f(x) = Ixl (-7T < x < 7T)
4. f(x) = .\'3 (-7T < X < 7T)
5.f(x)
6. f(x)
ISinxl(-7T<x<7T)
X < 7T)
e- 1xl (- 7T <
8. f(x) =
{
X
if
o
if
< x < !7T
!7T <
+ 7T) if -7T <
< x < 7T
9. f(x) = .r(x
if 0
-!7T
X
< ~7T
x < 0, f(x) = xC-x
+ 7T)
10. CAS EXPERIMENT. Size and Decrease of E*.
Compare the size of the minimum square error E* for
functions of your choice. Find experimentally the
factors on which the decrease of E* with N depends.
For each function considered find the smaIIest N such
that E* < 0.1.
if
-7T<X<O
11. (Monotonicity) Show that the minimum square error
if
O<X<7T
(6) is a monotone decreasing function of N. How can
you use this in practice?
7. f(x)
CHAP.11
506
Fourier Series, Integrals, and Transforms
[12-161 PARSEVAL'S IDENTITY
Usmg Parseval"s identity, prove that the series have the
indicated sums. Compute the first fev; partial sums to see
that the convergence is rapid.
12. L +
34
+
54
+
4
7T
7
4
+ ... = -
96
rr
1
"2 = 0.116850275
16
(Use Prob. 5. this set.)
= 1.014678032
15. J + -
I
(Use Prob. 15 in Sec. 11.1.)
4
I
3
7T
+ - 4 + ..
24
1.08232 3234
90
(Use Prob. 21 in Sec. 1l.1.)
13. L
1
+ -2 + 3
1
52
+ ...
=
~
1.23370 0550
8
16. I
+ - 6 + ... = -
56
7
3
(Use Prob. 9, this set.)
(Use Prob. 13 in Sec. 11.1.)
11.7
ILl
+ -6 + -
6
7T
960
= 1.001447078
Fourier Integral
Fourier series are powerful tools for problems involving functions that are periodic or are of
interest on a finite interval only. Sections 11.3 and L1.5 first illustrated this, and various further
applications follow in Chap. 12. Since, of course, many problems involve functions that are
nonperiodic and are of interest on the whole x-axis, we ask what can be done to extend the
method of Fourier series to such functions. This idea will lead to "Fourier integrals."
In Example I we stan from a special function fL of period 2L and see what happens
to its Fourier series if we let L ~ x. Then we do the same for an arbitral}' function fL
of period 2L. This will motivate and suggest the main result of this section, which is an
integral representation given in Theorem 1 (below).
E X AMP L E 1
Rectangular Wave
Consider the periodic rectangular wave fdx) of period 2L
J,N~
{;
> 2 given by
if
-£<x<-1
if
-I<x<
I<x< L.
if
The left part of Fig. 277 shows this function for 2L = 4, 8, 16 as well as the nonperiodic function f(x), which
we obtain from fL if we let L ~ x,
f(x) = lim
L-""
hex)
=
{I
if -I <x
0
II
-1
I
dx - £ '
I
an = £
II
-I
2
- dx = L
1171"X
cos -
I
otherwise.
We now explore what happens to the Fourier coefficients of fL as L
all n. For an the Euler formulas (6), Sec. 11.2. give
I
a - 0- 2L
<
L
increa~es.
II
0
Since fL is even, b n = 0 for
Il71"X
2 sin (1171"IL)
cos - - dx = - - - - £
L
1171"IL
This sequence of Fourier coefficients is called the amplitude spectrum of fL because lanl is the maximum
amplitude of the wave an cos (Ilm:lL). Figure 277 shows this spectrum fOf the periods 2L = 4, 8, 16. We see
that for increasing L these amplitude, become more and more dense on the positive wn-axis. where Wn = 1l71"1L.
Indeed, for 2£ = 4, 8, 16 we have I. 3, 7 amplitudes per "half-wave" of the function (2 sin wn)/(Lw n ) (dashed
in the figurel. Hence for 2L = 2k we have 2k - 1 - I amplitudes per half-wave. so that these amplitudes will
eventually be everywhere dense on the positive w.,-axis (and will decrease to zero).
The outcome of this example gives an intuitive impression of what about to expect if we turn from our special
•
function to an arbitrary one, as we shall do next.
SEC. 11.7
507
Fourier Integral
Waveform
fL
(x)
1 ,
rno,
Amplitude spectrum un(wn)
,
wn=nn/L
fn=5
\
x
~
,,
~
,
wn
,I"
n=3/
2L=4
,
n=:}
1
2
1'r\n=2
£n=lO
-,
x
'L
n=6/
r--2L=8~
fL(;;6
1
IE
W
n= 14
n
n=4
[1l;'I',uv 1-- . C r =20
n
1
-8
J ,..J
-'=1-_
1
x
8
0
2L= 16
4
n=12/
n=28/
Wn
"'I
___________f_(;;6~_______________________
-101
Fig. 277.
x
Waveforms and amplitude spectra in Example 1
From Fourier Series to Fourier Integral
We now consider any periodic function fL(X) of period 2L that can be represented by a
Fourier series
00
n7T
wn =
fdx) = ao + ~ (an cos WnX + bn sin wnx),
L
n=l
and find out what happens if we let L~ 00. Together with Example I the present calculation
will suggest that we should expect an integral (instead of a series) involving cos wx and
sin wx with W no longer restricted to integer multiples W = Wn = 117TIL of 7TIL but taking
all values. We shall also see what form such an integral might have.
If we insert an and bn from the Euler formulas (6), Sec. 1 1.2, and denote the variable
of integration by v, the Fourier series of fdx) becomes
We now set
ll.w =
Wn+l -
Wn
=
(n
+
L
1)7T
1l7T
7T
L
L
508
CHAP. 11
Fourier Series, Integrals, and Transforms
Then lIL = /1W/7T, and we may write the Fourier series in the form
(1)
fdx)
= -
1I
2L
L
iLlv) dv
1=[
+ - ~
(cos wnx) Llll'
7T n=l
-L
I
L
iLtv) cos WnV dv
-L
+ (sin wnx) .lw
f:/dV) sin wnv dVJ
This representation is valid for any fixed L, arbitrarily large, but finite.
We now let L ~ x and assume that the resulting nonperiodic function
f(x) = lim iLlx)
L_x
is absolutely integrable on the x-axis: that is, the following (finite!) limits exist:
(2)
lim
a~-x
IOlf(x)1 dx
a
+ lim fblf(x)1 dx
b~x
(written I_oo=lf(X) 1 dX) .
0
Then lIL ~ 0, and the value of the first term on the right side of (l) approaches zero.
Also LlW = 7T/L ~ 0 and it seems plausible that the infinite series in (l) becomes an
integral from 0 to Xl, which represents f(x), namely,
f(x)
(3)
=
-
I
7T
L°=[
cos wx I
=
f(v) cos wv dv
+ sin wx I
-x
=
f(v) sin wv dv
]
dw.
-00
If we introduce the notations
1
(4)
A(w)
7T
I
co
= - I
f(v) cos wv dv,
-cc
B(w)
co
= - I
7T
f(v) sin wv dv
-co
we can write this in the form
(5)
f(x)
=
LX lA(w) cos wx + B(w) sin wx] dw.
°
This is called a representation of f(x) by a Fourier integral.
It is clear that our naive approach merely suggests the representation (5), but by no
means establishes it; in fact. the limit of the series in (I) as Llw approaches zero is not
the definition of the integral (3). Sufficient conditions for the validity of (5) are as follows.
THEOREM 1
Fourier Integral
If f(x) is piecewise continllous (see Sec. 6.1) ill eve/}' finite interml and has a
right-hand derimtive alld a left-hand derivative at evel), point (see Sec ILl) and
call be represented by a Fourier imegral (5) with
A and B given by (4). At a point where f(x) is disconti1lllolis the value of the Fourier
integral equals the average of the left- and right-hand limits of f(x) at that point
(see Sec. 11.1). (Proof in Ref. [C 12]; see App. 1.)
if the integral (2) exists, then f(x)
SEC. 11.7
509
Fourier Integral
Applications of Fourier Integrals
The main application of Fourier integrals is in solving ODEs and PDEs, as we shall see
for PDEs in Sec. 12.6. However, we can also use Fourier integrals in integration and in
discussing functions defined by integrals, as the next examples (2 and 3) illustrate.
E X AMP L E 2
Single Pulse, Sine Integral
Find the Fourier integral representation of the function
C
f(x) =
Ixl <
loll>
if
if
---_~~:j
Fig. 278.
Solutioll.
I
(Fig. 278).
I
l--x
Example 2
From (4) we obtain
A(w)
= -I
7T
Joe f(v) cos
>l'U
dv = -
-x
Jl cos wv dv = -sin- - 11
I
II"V
7T
B(w) = -
mv
-1
Jl sin
I
7r
wU
-1
2 sin w
7n1'
dv = 0
-1
and (5) gives the answer
(6)
cos wx sin w
f(x) =
w
dw.
The average of the left- and right-hand limits of f(x) at x = I is equal to (I
Furthermore. from (6) and Theorem I we obtain (multiply by 7r12)
(7)
f
X
o
cos I n sin
11'
dll"
=
II"
r
if
+ 0)/2. that is. 111.
o ~x<
I.
=
1,
71/4
if
x
0
if
x> 1.
We mention that this integral is called Dirichlet's discontinous factor. (For P. L. Dirichlet see Sec. 10.8.)
The case x = 0 is of particular interest. If x = O. then (7) gives
co
(8*)
We see that this integral is the limit of the
1
o
sin II"
7r
--dw=-.
w
2
~o-called
sine integral
u
(8)
Si(lI)
=
1
sinw
--
o
dll'
W
II ~ x. The graphs of Si(lI) and of the integrand are shown in Fig. 279.
Tn the case of a Fourier series the graphs of the partial sums are approximation curves of the curve of the
periodic function represented by the series. Similarly, in the case of the Fourier integral (5). approximations are
obtained by replacing GO by numbers a. Hence the integral
as
(9)
21
7r
a
_c_os_w_x_s_in_'_v
0
approximares the right side in (6) and therefore f(x).
W
dw
510
CHAP.11
Fourier Series, Integrals, and Transforms
y
Integrand
1
~
1C
2
,;
Fig. 279.
Sine integral Situ} and integrand
Figure 280 shows o,cillations near the points of discontinuity of f(x). We might expect that these oscillations
disappear as a approaches infinity. But this is not tfile; with increasing a, they are shifted closer to the points
x = :!: I. Thi~ unexpected behavior. which also occurs in connection with Fourier series. is known as the Gibbs
phenomenon. (See also Problem Set 1l.2.) We can explain it by representing (9) in terms of sine integmls as
follows. Using (II) in App. A3.1. we have
2
-
I a cos wx sin w
7To
dll'
=
U'
-
I
I a sin (w + , "x)
1
+ -
U'
7To
In the fIrst integral on the right we set w +
dw
wx =
wx
7To
r. Then dw/w
=
0::<; t::<; (x + 1)1I. In the last integral we set w ~ -I. Then dw/w
0::<; t ::<; (x - I)a. Since sin (-t) = -sin t. we thus obtain
2
1T
I
a
.sm w
cos wx
0
I
dw=-
w
1T
I a sin (w - wx)
dt/t, and 0 ::<; w ::<; a corresponds to
= dt/t, and 0::<; w::<; a corresponds to
I(x+lla Sin. t
--dtt
0
dw.
l'\,'
1T
I
(X-lla smt
.
0
- - dt.
t
From this and (8) we see that our integral (9) equab
-
I
1T
Si(a[x
+
I]) - -
I
Si(lI[x - I])
1T
and the oscillations in Fig 280 result from those in Fig. 279. The increa~e of a amounts to a transformation
of the scale on the axis and causes the shift of the oscillations (the waves) toward the points of discontinuity
-1 and 1.
•
y
a= 16
2x
Fig. 280.
-2 -1 0
2x
-2 -1 0
The integral (9) for a = 8, 16, and 32
I
2x
SEC. 11.7
511
Fourier Integral
Fourier Cosine Integral and Fourier Sine Integral
For an even or odd function the Fourier integral becomes simpler. Just as in the case of
Fourier series (Sec. 1l.3), this is of practical interest in saving work and avoiding errors.
The simplifications follow immediately from the formulas just obtained.
Indeed. if f(x) is an evell function. then B(w) = 0 in (4) and
2
I
Alw) = -
(10)
:x:
feu) cos wu du.
0
7r
The Fourier integral (5) then reduces to the Fourier cosine integral
fO
f(x) =
(11)
A(w) cos
(f even).
WX {hI
o
Similarly, if f(x) is odd, then in (4) we have A(w) = 0 and
2
(12)
B(w)
L
GC
= -
f(u) sin wu du.
7r
0
The Fourier integral (5) then reduces to the Fourier sine integral
IXo Blw) sin \1'X dw
f(x) =
(13)
(f odd).
Evaluation of Integrals
Earlier in this section we pointed out that the main application of the Fourier integral is
in differential equations but that Fourier imegral representations also help in evaluating
certain integrals. To see this, we show the method for an important case, the Laplace
integrals.
E X AMP L E 3
Laplace Integrals
We shall derive the Fourier cosme and Fourier sine integrals of f(x) = e -kX, where x> 0 and k > 0 (Fig. 2lll).
The re~ult will be used to evaluate the so-called Laplace integrals.
Solutioll.
(a) From tIO) we have A(lI")
=
~ IXe7T
I
Fig. 281.
fIx)
Example 3
in
e -kv cos nov dv = -
2
k
kv
cos wv dv. Now. by integration by parts,
0
k
+w
If v = O. the expression on the right equals -kl(k 2
lero because of the exponential factor. Thus
2
e -lw
(
,
-lI" sm
k
-
II'V
+ cos wv )
.
+ w 2 ). If v approaches infinity. that expression approache~
(14)
By substituting this into (II) we thus obtain the Fourier cosine integral representation
L
:x:
2k
f(x) = e -k.-.; = -
7T
0
cos
2
k
liT
+w
2
dll"
(x> 0,
k> 0),
512
CHAP. 11
Fourier Series, Integrals, and Transforms
From this representation we see that
f
oo coswx
(15)
~
dw =
o k 2 +w 2
~ fooe-
(b) Similarly, from (12) we have B(w) =
7T
f
This equals -wl(k 2
.
e -kv slllwudu=-
+ ",2) if u
e-kx
2k
kv
(x> 0,
k > 0).
(x> 0,
k > 0).
sin wu du. By integration by parts
0
2
k
11'
(k-SIllWU+Coswu.
.
)
e -ku
2
+w
w
= 0, and approaches 0 as u ~ ce. Thus
(16)
From (13) we thus obtain the Fourier sine integral representation
L
00
2
k
f(x) = e - x = -
7T
11'sinwx
2
2 dw.
+w
k
0
From this we see that
=
w sin U'X
o k 2 + 11'2
L
(17)
-kx
7T
2
d11' =
e
•
The integrals (15) and (17) are called the Laplace integrals.
11-61
EVALUATION OF INTEGRALS
L
0::
Show that the given integral represents the indicated
function. Hint. Use (5), (11), or (13); the integral tells you
which one, and its value tells you what function to consider.
(Show the details of your work.)
if
0
00
1.
L
+ w sin xw dw =
1 + w2
cosxw
{ wl2
rre- x
x< 0
if
x= 0
if
x> 0
4.
o
oo
5.
L
r
•
smw
- cos xw dw
H'
cos (rrwI2)
l-w
o
2
if
O~x<
rr/4
if
x=
0
if
x>
cosxw dw
if
0 <
if
00
2.
L
•
Slnw-wcosw
w
o
2
sinxw dw
12
=
00
3.
L
cosxw
- - -2 dw
o 1 + w
rr
{
:0/4
- e - x if x
2
> 0
6.
if 0 < x < 1
=
if
x
if
x>
o::
L
sin rrw sin xw
1-
o
17-121
2
dw =
{¥
W
if
0
if
FOURIER COSINE INTEGRAL
REPRESENTATIONS
Represent j(x) as an integral (11).
7. f(x)
sin x
=
{
I
if
O<x<a
o
if
x>a
Ixl
Ixl
<
rr/2
~ 7T12
SEC. 11.8
2
8. f(x)
9. f(x)
10. f(x)
if
=r
0
=e
~f
=
O<x<a
if
16. f(x) =
x>a
if
0< x < 1
if
x > 1
if
xl2
17. f(x) =
o<x<
1 - x/2
if
<x<2
0
if
x>2
11. f(x) = rnx
0
12. f(x)
513
Fourier Cosine and Sine Transforms
{e~X
if
O<X<7T
if
X>7T
if
O<x<a
if
x>a
{I 0
x
19. f(x) =
0
FOURIER SINE INTEGRAL
REPRESENTATIONS
Represent f(x) as an integral (13).
14. f(x)
=
e
if
O<x<a
if
x> a
SlllX
15. f(x)
=
11.8
{
if
0
if
O<X<7T
if
x>
O<X<7T
r
X>
if
O<X<7T
if
~x
7T
7T
x>
if
O<x<a
if
x>a
20. PROJECT. Properties of Fourier Integrals
(a) Fourier cosine integral. Show that (11) implies
(al)
(a2)
f(ax) =
~
xf(x) =
fOA( :)
cos xw dw
(Scale change)
fOO B*(w) sin xw
dw,
o
replacing co with finite upper limits of your choice.
Compare the quality of the approximations. Write a
short report on your empirical results and observations.
114-191
O<x<
if
(a> 0)
13. CAS EXPERIMENT. Approximate Fourier Cosine
Integrals. Graph the integrals in Prob. 7, 9, and 11 as
functions of x. Graph approximations obtained by
if
if
{7T - x
ro~x
18. f(x) =
2
B*
(a3)
=
-
dA
dw '
A as in (10)
x 2 f(x) = f=A*(W) cosxw dw,
o
A* = -
d2A
dw 2
.
(b) Solve Prob. 8 by applying (a3) to the result of
Prob.7.
(c) Verify (a2) for f(x) = I if 0 < x < a and
f(x) = 0 if x > a.
(d) Fourier sine integral Find formulas for the
Fourier sine integral similar to those in (a).
Fourier Cosine and Sine Transforms
An integral transform is a transformation in the form of an integral that produces from
given functions new functions depending on a different variable. These transformations
are of interest mainly as tools for solving ODEs, PDEs, and integral equations, and they
often also help in handling and applying special functions. The Laplace transform
(Chap. 6) is of this kind and is by far the most important integral transform in
engineering.
The next in order of importance are Fourier transforms. We shall see that these
transforms can be obtained from the Fourier integral in Sec. 11.7 in a rather simple fashion.
In this section we consider two of them, which are real, and in the next section a third
one that is complex.
514
CHAP. 11
Fourier Series, Integrals, and Transforms
Fourier Cosine Transform
For an even function f(x), the Fourier integral is the Fourier cosine integral
(1)
(a)
f(x)
= t"'A(W) cos wx dw.
where
(b)
A(w)
o
2
= -
7T
[see (10), (11), Sec. 11.71. We now set A(w)
Then from (1 b), writing v = x, we have
L""
f(v) cos wv dv
0
= "'v'ij; icCw), where c suggests "cosine."
ac
~- L
icCw) =
(2)
f(x) cos wx dx
0
'iT
and from (la),
CIJ
(3)
f(x)
=
~L
-
7T
ie(w) cos
WX
dll'.
0
ATTENTION! In (2) we integrate with respect to x and in (3) with respect to 11". Formula
(2) gives from f(x) a new function ie(w), called the Fourier cosine transform of f(x).
Formula (3) gives us back f(x) from ie(w), and we therefore call f(x) the inverse Fourier
cosine transform of ie(w).
The process of obtaining the transform ie from a given f is also called the Fourier
cosine transform or the Fourier cosine transJoml method.
Fourier Sine Transform
Similarly, for an odd function f(x), the Fourier integral is the Fourier sine integral [see
(12), (13), Sec. 11.7]
L
2
CIJ
(4) (a)
f(x)
=
o
where
B(w) sin wx dw,
(b)
B(w)
=-
7T
L
ac
f(v) sin wv dv.
0
We now set B(w) = "'v'ij; isCw), where s suggests "sine:' Then from (4b), writing v = x,
we have
(5)
fsCw)
A
=
~
7T
L
CXJ
f(x) sin wx dx.
0
This is called the Fourier sine transform of f(x). Similarly, from (-I-a) we have
(6)
f(x) =
""A
L
~
-
7T
fs(w) sin wx dw.
0
This is called the inverse Fourier sine transform of is(w). The process of obtaining iAw)
from f(x) is also called the Fourier sine transform or the Fourier sine transJoT11111letlzod.
Other flotations are
and g;;;-l and 9F;! for the inverses of ;!Fe and 9Fs, respectively.
SEC. 11.8
Fourier Cosine and Sine Transforms
EXAMPLE
515
Fourier Cosine and Fourier Sine Transforms
Find the Fourier cosine and Fourier sine
tran~forms
x=a
x
fIx) in
Solution.
ifO<x<a
{k
j(x) =
Fig. 282.
of the function
(Fig. 282).
o
if x> a
From the definitions (2) and (5) we obtain by integration
Example 1
,
Ic(w) =
fa cos wx dt = -Vf2
aw )
-:;; k (sin
-w-
-Vf2
-:;; k
0
, If fa
Iiw) =
k
-
7T
sin ll'X dx
=
0
If
-
k (1 - cos
aw ) .
W
7T
This agrees with formulas 1 in the first two tables in Sec. 11.10 (where k = 1).
Note that for I(x) = k = const (0 < x < co). these transforms do not exist. (Why?)
E X AMP L E 2
•
Fourier Cosine Transform of the Exponential Function
Find ?Fc(e -x).
Solution.
By integration by parts and recursion.
-
?Fc(e x)
=
~f2
Leo
-7T
0 e
-1·
. cos wx dx =
f;
-7T
e-·l'
---2
l+w
This agrees with formula 3 in Table 1. Sec. 11.10. with 1I
=
(-cos wx + w sin wx)
1=
0
V2hr
= ---2
l+w
1. See also the next example.
•
What did we do to introduce the two integral transforms under consideration? Actually
not much: We changed the notations A and B to get a "symmetric" distribution of the
constant 2/7T in the original formulas (10)-(13), Sec. 11.7. This redistribution is a standard
convenience, but it is not essential. One could do without it.
What have we gained? We show next that these transforms have operational properties
that permit them to convert differentiations into algebraic operations (just as the Laplace
transform does). This is the key to their application in solving differential equations.
Linearity, Transforms of Derivatives
If f(x) is absolutely integrable (see Sec. 11.7) on the positive x-axis and piecewise
continuous (see Sec. 6.1) on every finite interval, then the Fourier cosine and sine
transforms of f exist.
Furthermore, if f and g have Fourier cosine and sine transforms, so does af + bg for
any constants a and b, and by (2),
oo
9F cCaf
+
bg)
~L
cc
=
~L
=
[af(x)
-
7T
a
+
bg(x)] cos
lIJX
dx
0
-
7T
f(x) cos wx d-1:
+
b
0
~oo
gCx) cos wx dx.
7T
L
0
The right side is a9Fc(f) + b2Fc(g). Similarly for 2Fs, by (5). This shows that the Fourier
cosine and sine transforms are linear operations,
(7)
(a)
'*c(af
+
bg)
= a9Fc(f) + b9FcCg),
(b)
9FsCaf
+
bg) = a9FsCf)
+
b9Fs(g).
CHAP. 11
516
THEOREM 1
I
Fourier Series, Integrals, and Transforms
Cosine and Sine Transforms of Derivatives
t'
Let f(x) be continuous and absolutely integrable on the x-axis, let (x) be piecewise
continuous on eve'}' finite interval, and let let f(x) ~ 0 as x ~ 00. Then
(a)
= w9's{f(x)} -
9' e{f' (x)}
(8)
9's{f'(x)}
(b)
PROOF
[f
f(O),
= -w9'e{f(x)}.
This follows from the definitions by integration by parts, namely,
=
[f I
7T
=
CC
-
f'(x) cos wx dx
0
~ [f(X) cos wx
= -
[f
f(O)
I:
+
w
f~ f(x) sin wx dxJ
+ w9's{f(x)};
and similarly,
, [fIX'
9'sff (x)} =
f (x) sin wx dx
-
7T
0
=
~ [f(X) sin wx
=
0 - w9'e{f(x)}.
I: - LX
w
f(x) cos wx dx ]
•
Formula (8a) with t' instead of f gives (when f', f" satisfy the respective assumptions
for f, J' in Theorem 1)
~e {
Q7;
f " (x)} =
Q7;'
w~s{f
(x)}
-
-V{2
-:;;. f '·0
( );
hence by (8b)
(9a)
Similarly,
(9b)
A basic application of (9) to PDEs will be given in Sec. 12.6. For the time being we
show how (9) can be used for deriving transforms.
SEC. 11.8
517
Fourier Cosine and Sine Transforms
E X AMP L E 3
An Application of the Operational Formula (9)
Find the Fourier cosine transform <;IFc(e -ax) of f(x) = e -ax, where a > O.
Solution.
By differentiation, (e- ax)" = a 2 e- ax ; thus
From this, (9a), and the linearity (7 a),
Hence
The answer is (see Table I, Sec. ll.lO)
(a> 0).
•
Tables of Fourier cosine and sine transfonns are included in Sec. 11.10.
I -10
1
12. Find the answer to Prob. 11 from (9b).
FOURIER COSINE TRANSFORM
< 2.
13. Obtain formula 8 in Table II of Sec. 11.1 I from (8b)
and a suitable formula in Table I.
2. Let f(x) = x if 0 < x < k, f(x) = 0 if x > k. Find
14. Let f(x) = sinx if 0 < x < 7T and 0 if x> 7T. Find
9's(f). Compare with Prob. 6 in Sec. 11.7. Comment.
1. Let f(x) = - I if 0 < x < L f(x)
f(x) = 0 if x > 2. Find ic(w),
= 1 if 1 <
x
Ic(w),
3. Derive formula 3 in Table 1 of Sec. 11.10 by integration.
4. Find the inverse Fourier cosine transform f(x) from the
answer to Prob. 1. Hint. Use Prob. 4 in Sec. 11.7.
5. Obtain 9';:-1(1/(1 + w 2 )) from Prob. 3 in Sec. 11.7.
6. Obtain 9';:-I(e- W ) by integration.
2
7. Find 9'c«(1 - X )-1 cos (7TX/2». Hint. Use Prob. 5 in
Sec. 11.7.
8. Let f(x) = x 2 if 0 < x < I and 0 if x> 1. Find 9'cCf).
9. Does the Fourier cosine transform of X-I sin x exist?
Of X-I cos x? Give reasons.
10. f(x) = 1 (0 < x < (0) has no Fourier cosine or sine
transform. Give reasons.
/11-201
FOURIER SINE TRANSFORM
11. Find 9's(e-"'-X) by integration.
15. In Table II of Sec. 11.10 obtain formula 2 from formula
4, using r@ = y.;;: [(30) in App. 3.1].
16. Show that 9'sCx- 1I2 ) = w- 1I2 by setting wx = t 2 and
using S(oo) = y:;;j8 in (38) of App. 3.1.
17. Obtain 9'sCe- ax ) from (8a) and formula 3 in Table I of
Sec. 11.10.
18. Show that 9's(x- 3/2 ) = 2w 1/2• Hint. Set wx = t 2 ,
integrate by parts, and use C(oo) = y:;;j8 in (38) of
App.3.1.
19. (Scale change) Using the notation of (5), show that
f(ax) has the Fourier sine transform (1/a)IsCw/a).
20. WRITING PROJECT. Obtaining Fourier Cosine
and Sine Transforms. Write a short report on ways
of obtaining these transforms, giving illustrations with
examples of your own.
518
11.9
CHAP. 11
Fourier Series, Integrals, and Transforms
Fourier Transform.
Discrete and Fast Fourier Transforms
The two transforms in the last section are real. We now consider a third one, called the
Fourier transform, which is complex. We shall obtain this transform from the complex
Fourier integral. which we explain first.
Complex Form of the Fourier Integral
The (real) Fourier integral is [see (4), (5), Sec. 11.7]
LX
[A(w) cos wx + B(w) sin wx]
o
f(x) =
(hI'
where
1
A(w)
= 7T
f
1
x
f(v) cos wv dv,
7T
-x
Suhstituting A and B into the integral for
1
f(x)
= -
B(w)
Lf
GC
= 7T
0
J=
f(v) sin wv dv.
-0<:
f, we have
x
f(v) Lcos wv cos
IVX
+
sin wv sin wx] dv dlV.
-x
By the addition formula for the cosine L(6) in App. A3.1] the expression in the brackets
[... ] equals cos (wv - wx) or, since the cosine is even, cos (wx - wv). We thus obtain
(1 *)
f(x)
L=[ f""
= -1
7T
0
f(v) cos (wx - wv) dv
]
dw.
-x
The integral in brackets is an even function of w. call it F(w). because cos (wx - wv) is
an even function of w, the function f does not depend on IV, and we integrate with respect
to v (not w). Hence the integral of F(w) from w = 0 to x is 1/2 times the integral of F(w)
from - x to x. Thus (note the change of the integration limit!)
(1)
f(x)
=
Lx= [ L::.l(V)
cos (wx -
1
= 27T
wv) dv
]
dw.
We claim that the integral of the form (1) with sin instead of cos is zero:
(2)
-27T f J
1
x
-0<:
[
00
f(v) sin (wx - wv) dv
]
dw
= O.
-x
This is true since sin (wx - IVV) is an odd function of lV, which makes the integral in
brackets an odd function of w, call it G(w). Hence the integral of G(W) from - x to ex; is
zero, as claimed.
We now take the integrand of (1) plus i (= -v=T) times the integrand of (2) and use
the Euler formula l( 11) in Sec. 2.2]
(3)
e ix
=
cos x
+ i sin x.
SEC 11.9
519
Fourier Transform. Discrete and Fast Fourier Transforms
Taking wx -
IrV
instead of x in (3) and multiplying by f(v) gives
+
f(v) cos (~~:\" - wv)
if(v) sin (wx - wv) = f(v)ei(WX-Wv)
Hence the result of adding (1) plus i times (2), called the complex Fourier integral, is
(4)
f f
27T
= -
f(x)
I
=:xl
-oc
.
-=
f(v)e'W(X-v) dv dw
(i
= v=i).
It is now only a very short step to our present goal, the Fourier transform.
Fourier Transform and Its Inverse
Writing the exponential function in (4) as a product of exponential functions, we have
1
f(x) = - -
(5)
yI2;
f [I
-f
yI2;
x
cc
-x
-cc
.
f(v)e-'WV dv
J.
e'wx dw.
The expression in brackets is a function of tV, is denoted by 1(».'), and is called the Fourier
transform of f; writing v = x, we have
A
few)
(6)
= -1~
f= f(x)e-'wx
. dx.
-x
With this, (5) becomes
I
f(x) = - -
(7)
yI2;
f= f(w)e'WX dw
A
•
-0:;
and is called the inverse Fourier transform of jew).
Another notation for the Fourier transform is
I
= ?F(f),
so that
The process of obtaining the Fourier transform ':!F(f) = I from a given f is also called
the Fourier transform or the Fourier transfon1l method.
Conditions sufficient for the existence of the Fourier transform (involving concepts
defined in Secs. 6.1 and 11.7) are as follows. as we state without proof.
THEOREM 1
Existence of the Fourier Transform
{tJCx) is absolutely integrable on the x-axis and piecewise continuous on every finite
interval. then tile Fourier transform ICw) of f(x) given by (6) exists.
520
E X AMP L E 1
CHAP. 11
Fourier Series, Integrals, and Transforms
Fourier Transform
Find the Fourier transfonn of f(x) = I if Ixl < 1 and f(x) = 0 otherwise.
Solution. Using (6) and integrating, we obtain
-
few)
I
= --
v'2;
Jl e-
iwx
1
e-
iwx
dx = - - . - - .-
v'2;
-1
-IW
As in (3) we have e iw = cos W + i sin w, e- iw = cos w - i sin w, and by subtraction
i
w -
e- iw
=
2; sin w.
Substituting this in the previous formula on the right, we see that i drops out and we obtain the answer
_
few)
E X AMP L E 2
"2 --;-.
•
,-:;; sin w
=
Fourier Transform
Find the Fourier transfonn '!F(e -ax) of f(x)
= e -ax
if X > 0 and f(x)
=
0 if x < 0; here a >
+
iw)
o.
Solution. From the definition (6) we obtain by integration
L
oo
m(
ere -ax)=
1
v27r
,~
e -axe -iwx dx
0
e-ca+iw)X
v'2;
-(a
+
iw)
v'2;(a
•
This proves fonnula 5 of Table III in Sec. 11.10.
Physical Interpretation: Spectrum
The nature of the representation (7) of f(x) becomes clear if we think of it as a superposition
of sinusoidal oscillations of all possible frequencies, called a spectral representation.
This name is suggested by optics, where light is such a superposition of colors
(frequencies). In (7), the "spectral density" jew) measures the intensity of f(x) in the
frequency interval between wand w + Aw (Aw small, fixed). We claim that in connection
with vibrations, the integral
f,o Ij(w)1
2
dw
-co
can be interpreted as the total energy of the physical system. Hence an integral of Ij(w)1 2
from a to b gives the contribution of the frequencies w between a and b to the total energy.
To make this plausible, we begin with a mechanical system giving a single frequency,
namely, the harmonic oscillator (mass on a spring, Sec. 2.4)
my"
+ ky =
o.
Here we denote time t by x. Multiplication by y' gives my' y"
!mv 2
+ ky' y = O. By integration,
+ !ky2 = Eo = const
where v = y' is the velocity. The first term is the kinetic energy, the second the potential
energy, and Eo the total energy of the system. Now a general solution is (use (3) in
Sec. 11.4 with t = x)
SEC 11.9
521
Fourier Transform. Discrete and Fast Fourier Transforms
w0
2
= kIm
iwoX
)/2, C l = CI = (al + ib l )/2. We write simply A
cle
,
B =
= A + B. By differentiation, v = y' = A' + B' = iWo(A - B).
Substitution of v and)' on the left side of the equation for Eo gives
where
=
(al - ib I
iWoX
cle• Then y
Here
Cl
Wo
2
= kim. as just stated: hence mwo2 = k. Also i2 = -I. so that
Hence the energy is proponional to the square of the amplitude
lell.
As the next step, if a more complicated system leads to a periodic solution y = f(x)
that can be represented by a Fourier series, then instead of the single energy term IeII2 we
get a series of squares Ienl 2 of Fourier coefficients Cn given by (6), Sec. 11.4. In this case
we have a "discrete spectrum" (or "point spectrum") consisting of countably many
isolated frequencies (infinitely many, in general), the corresponding Ienl 2 being the
contributions to the total energy.
Finally, a system whose solution can be represented by an integral (7) leads to the above
integral for the energy, as is plausible from the cases just discussed.
Linearity. Fourier Transform of Derivatives
New transforms can be obtained from given ones by
THEOREM 2
Linearity of the Fourier Transform
The Fourier transform is a linear operation; that is, for any functions f(x) and g(x)
whose Fourier transforms exist and any constants a and b, the Fourier transform
of af + bg exists, and
(8)
PROOF
g;(af
+ bg)
=
a'2F(f)
+
bgjP(g).
This is true because integration is a linear operation, so that (6) gives
I
gjP{af(x)
f
v 27T
+ bg(x)} = ~ ~
co
_
[af(x)
+ bg(x)]e-
tWX
dx
-::>0
f
yI2;
1
::>0.
= a --
f(x)e- tWX dx
-co
+
J g(x)e-
1::>0.
b --
yI2;
tWX
dx
-::>C
•
= agjP{f(x)} + b'2F{g(x)}.
In applying the Fourier transform to differential equations, the key property
differentiation of functions corresponds to multiplication of transforms by iw:
IS
that
522
CHAP. 11
THEOREM 3
Fourier Series, Integrals, and Transforms
Fourier Transform of the Derivative of I(x)
Let f(x) be continuous on the x-axis and f(x) ~ 0 as
f' (x) be absolutely integrable on the x-axis. Then
co.
Furthermore, let
~(f'(x)} = iw~{f(x)}.
(9)
PROOF
Ixl ~
From the definition of the Fourier transform we have
,
~{f (x)} =
1
--
IX,f (x)e- .
ZWX
yI2; -=
dx.
Integrating by parts, we obtain
~{f'(x)}
Since f(x) ~ 0 as
=
~
yI2;
[f(x)e-
loc -
iWX
-=
(-iw)
I
oo
-=
f(x)e- iwx dX] .
Ixl ~ ro, the desired result follows, namely,
~(f'(x)} =
+
0
iw~{f(x)}.
•
Two successive applications of (9) give
~(f") = iw~(f') = (iwf~(f).
Since (iW)2 = -w 2, we have for the transform of the second derivative of f
~{f"(x)}
(10)
-w2~{f(X)}.
=
Similarly for higher derivatives.
An application of (l0) to differential equations will be given in Sec. 12.6. For the time
being we show how (9) can be used to derive transforms.
E X AMP L E 3
Application of the Operational Formula (9)
Find the Fourier transform of xe
Solution.
-x'
from Table III. Sec 11.10.
We use (9). By formula 9 in Table III.
~(~e-X2) = ~{
.
+(e-X2)'}
= -
~ ~{(e-x'n
= -
-
I
2
ill' -
1
v'2
e- w
2}
4
•
SEC. 11.9
Fourier Transform. Discrete and Fast Fourier Transforms
Convolution
The convolution f * g of functions f
(11)
h(x)
=
(f
* g)(x) =
I
523
and g is defined by
oo
f(P)R(x - p) dp
=
t;
f(x - p)g(p) dp.
-x
-~
The purpose is the same as in the case of Laplace transforms (Sec. 6.5): taking the
convolution of two functions and then taking the transform of the convolution is the same
as multiplying the transforms of these functions (and multiplying them by \.1'2;):
THEOREM 4
Convolution Theorem
Suppose that f(x) and g(x) are piecewise continuous, bounded. and absolutely
intes;rable Oil the x-axis. Then
'?Jf(f
(12)
PROOF
* g)
\.1'2; '?Jf(f)'?Jf(g).
=
By the definition,
'?Jf(f
* g)
I
I I
\.1'2;
x
x
= --
-00
.
f(p)g(x - p) dp e- ZWX dx.
-00
An interchange of the order of integration gives
I I
\.1'2;
I
'?Jf(f
X:JC
* g) = - -
-x
.
f(p)g(x - p)e- ZWX d-..: dp.
-x
Instead of x we now take x - p = q as a new variable of integration. Then x = p
'?Jf(f
* g)
I I
\.1'2;
I
00
00
= --
-x
+ q and
.
f(p)g(q)e-ZW(p+q) dq dp.
-x
This double integral can be written as a product of two integrals and gives the desired
result
lOX:.
00
.
'?Jf(f * g) = - f(p)e- ZWP dp
g(q)e-·wq dq
\.1'2;
I
I
-cxo
-00
•
By taking the inverse Fourier transform on both sides of (12), writing j = '?Jf(f) and
as before, and noting that \.1'2; and l/\.I'2; in (12) and (7) cancel each other,
we obtain
g = '?Jf(g)
(13)
(f
* g)(X) =
{C j(w)g(w)e
iwx
dw,
-x
a formula that will help us in solving partial differential equations (Sec. 12.6).
524
CHAP. 11
Fourier Series, Integrals, and Transforms
Discrete Fourier Transform (OFT),
Fast Fourier Transform (FFT)
In using Fourier series, Fourier transforms, and trigonometric approximations (Sec. 11.6)
we have to assume that a function f(x), to be developed or transformed, is given on some
interval, over which we integrate in the Euler formulas, etc. Now very often a function
f(x) is given only in terms of values at finitely many points. and one is interested in
extending Fourier analysis to this case. The main application of such a "discrete Fourier
analysis" concerns large amounts of equally spaced data, as they occur in
telecommunication, time series analysis, and various simulation problems. In these
situations. dealing with sampled values rather than with functions. we can replace the
Fourier transform by the so-called discrete Fourier transform (DFT) as follows.
Let f(x) be periodic, for simplicity of period 27f. We assume that N measurements of
f(x) are taken over the interval 0 ~ x ~ 27f at regularly spaced points
27fk
(14)
k = 0, 1, ... , N - 1.
N'
We also say that f(x) is being sampled at these points. We now want to determine a
complex trigonometric polynomial
N-l
(15)
q(x) =
2:
inxk
cne
n~O
that interpolates f(x) at the nodes (14). that is. q(Xk) = f(Xk). written out, with fk denoting
f(Xk).
N-l
fk =
(16)
f(xk)
=
q(Xk)
=
2:
c,/nxk
k = 0, 1, ... , N - 1.
n~O
Hence we must determine the coefficients co, ... , CN - 1 such that (16) holds. We do this
by an idea similar to that in Sec. 11.1 for deriving the Fourier coefficients by using the
orthogonality ofthe trigonometric system. Instead of integrals we now take sums. Namely,
we multiply (16) by e- imxk (note the minus!) and sum over k from 0 to N - 1. Then we
interchange the order of the two summations and insert Xk from (14). This gives
N-l
(17)
~
£..J
k~O
f
N-IN-l
ke
-imxk _
~
~
iCn-m)xk _
- £..J £..J cne
N-l
N-l
~
~
n=O
k=O
iCn-m)27TkIN
- £..J Cn £..J e
k~O n~O
.
Now
We donote [ ... ] by r. For n = m we have r = eO = 1. The sum of these terms over k
equals N, the number of these terms. For n
m we have r
1 and by the formula for a
geometric sum [(6) in Sec. 15.1 with q = rand n = N - 1]
"*
1 - rN
N-l
2:
k=o
"*
rk
=
1-
=0
r
SEC. 11.9
525
Fourier Transform. Discrete and Fast Fourier Transforms
because r N = 1; indeed, since k, m, and n are integers,
rN
=
eiCn - m )27Tk
= cos 27Tk(n
- m)
+
i sin 27Tk(n - m)
= I + 0 = 1.
This shows that the right side of (17) equals cmN. Writing n for m and dividing by N, we
thus obtain the desired coefficient formula
(18*)
fk
=
f(Xk),
n = 0, 1, ... , N - 1.
Since computation of the Cn (by the fast Fourier transform, below) involves successive
halfing of the problem size N, it is practical to drop the factor lIN from Cn and define the
discrete Fourier transform of the given signal f = [fo
fN_I]T to be the vector
f = [io
iN-I] with components
in =
(18)
N-l
NC = ~ fke-inXk,
n
fk
=
f(Xk),
n
= 0, ... , N - 1.
k=O
This is the frequency spectrum of the signaL
In vector notation, f = FNf, where the N X N Fourier matrix FN = [enk] has the
entries [given in (18)]
(19)
where n, k = 0, ... , N - 1.
E X AMP L E 4
Discrete Fourier Transform (OFT). Sample of N
=
4 Values
Let N = 4 measurements (sample values) be given. Then w = e -2r.i/N = e -",i/2 = -i and thus w nk = (_i)nk.
Let the sample values be. say f = [0 I 4 9]T. Then by (18) and (19).
A
(20)
[ w'
wO
f = F4f =
wO
wO
wO
wO
wI
w2
w
w
2
3
w
w
4
::1
6
9
w6
w
f = [:
I
-i
-I
1
-I :1 [~1 [-41: 8i1
=
-[
-I
4
-6
-i
9
-4 - 8i
From the first matrix in (20) it is easy to infer what F N looks like for arbitrary N. which in practice may be
1000 or more, for reasons given below.
•
f = FNf we can recreate the given signal
= FIV 1 f, as we shall now prove. Here F N and its complex conjugate FN = ~ [wnk] satisfy
From the DFT (the frequency spectrum)
f
(21a)
where I is the N
(21b)
X
N unit matrix; hence FN has the inverse
-1 _
FN
-
1 N FN •
CHAP. 11
526
PROOF
Fourier Series. Integrals. and Transforms
We pr~ve (21). By the multiplication rule (row times col~n) the product matrix
G N = FNFN = [gjk] in (21a) has the entries gjk = Row j ofFN times Column k ofFN .
That is, writing W = wjw\ we prove that
= W O + WI
+ ... +
WN- 1 =
{~
if
j=l=-k
if
j = k.
Indeed, when j = k, then w\v k = (wwl = (e2TTiINe-2uiIN)k = 1k = L so that the sum
of these N tenns equals N; these are the diagonal entries of G N . Also, when j =f=. k, then
W =I=- 1 and we have a geometric sum (whose value is given by (6) in Sec. 15.1 with
q=Wandn=N-l)
WO
+
WI
+ ... +
WN-1
= 1 - WN
=
0
1- W
•
We have seen that f is the frequency spectrum of the signal f(x). Thus the components
in of f give a resolution of the 2 IT-periodic function f(x) into simple (complex) harmonics.
Here one should use only n's that are much smaller than N!2, to avoid aliasing. By this we
mean the effect caused by sampling at too few (equally spaced) points, so that. for instance,
in a motion picture, rotating wheels appear as rotating too slowly or even in the wrong sense.
Hence in applications, N is usually large. But this poses a problem. Eq. (18) requires O(N)
operations for any particular n, hence O(N2) operations for, say. alln < N!2. Thus, already
for 1000 sample points the straightforward calculation would involve millions of operations.
However, this difficulty can be overcome by the so called fast Fourier transform (FFT),
for which codes are readily available (e.g. in Maple). The FFT is a computational method
for the DFT that needs only O(N) log2 N operations instead of O(N2). It makes the DFT a
practical tool for large N. Here one chooses N = 2P (p integer) and uses the special fonn
of the Fourier matrix to break down the given problem into smaller problems. For instance.
when N = 1000, those operations are reduced by a factor lOOO/log2 1000 = 100.
The breakdown produces two problems of size M = N12. This breakdown is possible
because for N = 2M we have in (19)
The given vector f = [fo
fN_I]T is split into two vectors with M components
each, namely, fev = [fo f2
fN_2]T containing the even components of f, and
fod = [fl!3
fN_d T containing the odd components of f. For fev and fod we
determine the DFTs
r=
r
fev = fev.o
iev.2
~
fev.N-2
fOd = [iOd,1
iOd.3
f~od,N-l
A
[~
and
FMfev
= F Mfod
involving the same M X M matrix F M' From these vectors we obtain the components of
the DFT of the given vector f by the formulas
(22)
(a)
(b)
in+M
= iev,n
- wNniod.n
11
= 0,"', M - 1
11
= 0,"', M - 1.
SEC. 11.9
527
Fourier Transform. Discrete and Fast Fourier Transforms
For N = 2P this breakdown can be repeated p - 1 times in order to finally arrive at NI2
problems of size 2 each, so that the number of multiplications is reduced as indicated
above.
We show the reduction from N = 4 to M = NI2 = 2 and then prove (22).
E X AMP L E 5
Fast Fourier Transform (FFT). Sample of N = 4 Values
When N = 4. then
Consequently.
= WN =
W
fev
-i as in Example 4 and M
=
[/0]
A
= F2 f ev =
f2
= N12 = 2. hence It' = "'M = e-2m/ 2 = e--rr; =
[I I] [foJ
I
-I
-I.
[fo + 12J
=
fo - f2
f2
From this and (22a) we obtain
lo
II
lev.o + wN°lod.O = (fo + f2) + (fl + 13)
=
=
lev,1 +
1
H'N lod.l =
= fo
+ ft +,(2 + 13
efo - f2) - i(fl + f3) = fo - iIt - f2 + i13-
Similarly. by (22b).
A
0
A
f2
=
A
fev,o -
1
A
f3 =
A
fod,O =
II'N
A
f ev.l - wN f ud.l
=
efo + f2) - (fl + f3)
= fo -
(fo - f2) - (-i)(fl - f3)
=
This agrees with Example 4, as can be seen by replacing O. l. 4, 9 with
ft +
fo + if1
12 - f3
-
f2 - if3'
fo. ft, 12, /3,
•
We prove (22). From (8) and (19) we have for the components of the DFT
Splitting into two sums of M
fn =
=
NI2 terms each gives
M-I
~
.c..,
M-I
2kn
WN
f2k
+
~
.c..,
(2k+ 1m
WN
.f2k+l·
k=O
We now use
WN
2
=
WM
and pull out
n
WN
from under the second sum, obtaining
M-l
(23)
~
f- n =.c..,
k=O
M-I
knf
WM
eV,k
+
WN
n
~
.c..,
knf
WM
od,k'
k=O
The twO sums are f eV,n and f od.no the components of the "half-size" transforms F fev and
Ffod '
Formula (22a) is the same as (23). In (22b) we have 11 + M instead of n, This causes
a sign change in (23), namely -wN n before the second sum because
This gives the minus in (22b) and completes the proof.
•
CHAP. 11
528
Fourier Series, Integrals, and Transforms
-1. (Review) Show that 1Ii
e ix - e -;,,, = 2i sin x.
12-91
=
-i, e ix
+
e-
ix
=
xe- X
2 cos x,
8. f(x) =
{
o
if -1 < x < 0
otherwise
FOURIER TRANSFORMS BY INTEGRATION
Find the Fourier transform of f(x) (without using Table III
in Sec. ILl 0). Show the details.
-I
9. f(x)
01
=
if-l<x<O
if
{
2. f(x) =
{
e kX
if x < 0
o
ifx>O
0 < x <
otherwise
(k> 0)
OTHER METHODS
k ifO<x<b
3. f(x) = { 0
otherwise
e2iX
4. f(x)
=
{
if - I < x <
o
otherwise
e
if-l<x<
5. f(x) =
otherwise
if -I < x < 1
6. f(x) = {:
otherwise
ifO<x<
7. f(x) = {:
otherwise
10. Find the Fourier transform of f(x)
= xe- x
if x> 0 and
< 0 from formula 5 in Table III and (9) in the
text. Him: Consider xe- x and e- x .
o if x
11. Obtain '!-F(e-x"/2) from formula 9 in Table
m.
12. Obtain formula 7 in Table III from formula 8.
13. Obtain formula 1 in Table III from formula 2.
14. TEAM PROJECT. Shifting. (a) Show that if f(x)
has a Fourier transform, so does f(x - a), and
SC{f(x - a)} = e-iwaSC[f(x)}.
(b) Using (a), obtain formula 1 in Table III, Sec. l1.10,
from formula 2.
(e) Shifting on the w-Axis. Show that if j(lI') is the
Fourier transform of f(x), then J(w - a) is the Fourier
transform of eiaxf(x).
(d) Using (c), obtain formula 7 in Table TTTfrom 1 and
formula 8 from 2.
SEC 11.10
Tables of Transforms
529
11.10 Tables of Transforms
Table I.
Fourier Cosine Transforms
See (2) in Sec. 11.8.
f(x)
I
I
{~
X
< a
otherwise
2
xa-
3
e- ax
4
e- x2/2
5
e-ax"
6
xne- ax
(a> 0)
ro~,
ifO<x<a
7
1
(0
< a<
I)
(a> 0)
H7f
H7f
H
no) cos
wa
I
?]PeW
a7f
(na) see App. A3.1.)
2
((12: W2 )
e- w2/ 2
(0)
0)
otherwise
_I_ e- w2/(4a>
'\~
H
n!
(a 2
1
(W2
4a -
7f)
"4
1
C1,2
4a
"4
(a> 0)
V2c; cos
9
sin (llX 2)
(a> 0)
vTc; co~
smax
-x
(a> 0)
e- x sin x
Re (a
+ w2)n+l
_1_ [ sin a(l - w)
\12;
I-w
cos «(lX 2)
II
=
sinaw
w
8
\0
I
if 0 <
fet w )
+
+
+
Re =
Real part
iW)71+1
sin o( 1
+
+
w
I
w) ]
7f)
H
(See Sec. 6.3.)
(1 - lI(w - a»
I
2
- - arctan2
\12;
X
w
I
12
lo(ax)
(a> 0)
\ff
I
Va2 -
w
2 (l
- lI(w - 0))
(See Secs. 5.5, 6.3.)
530
CHAP. 11
Fourier Series. Integrals. and Transforms
Table II.
Fourier Sine Transforms
See (5) in Sec. 11.8.
I
J(x)
{~
1
ifO<x<a
otherwise
is(w)
H[
1 - cosaw ]
W
7f
2
1/~
1/,,;;'
3
1/J3/2
2~
4
xa-I
(O<a<l)
= ~s(J)
H
r(a) sin
w
7f
(!7f
a
(rca) see App. A3.1.)
2
I
e- ax
5
fI (
(a> 0)
V
e- ax
la > 0)
--
6
x
7
xne- ax
8
xe- x2/2
9
xe- ax2
10
{Si~X
(a> 0)
H
-
IT
V
7f
(a > 0)
if 0 <
11
cos (I\"
-x
12
2a
arctan -
X
(a
x
I
H.·
2
)
arctan -w
a
II!
(0 2
+
~1,2)n+ 1
1m (a
+
iw)n+l
1m =
Imaginary part
we- w2/2
< a
otherwise
I
w
+
7f
I
i
a2
7f
> 0)
(a > 0)
_~_v_ e-w2/4a
(2a)3/2
_1_ [ sin aU - w) _ sin aO + w) ]
yI2;
I-w
1+w
E
V 2
yI2;
(See Sec. 6.3.)
u(w - a)
sinh (/11'
w
e
-aw
SEC. 11.10
531
Tables of Transforms
Fourier Transforms
Table III.
See (6) in Sec. 1l.9.
---
I
i(x)
1
2
3
4
ifb<x<c
e- ibw
+a
{2X:
,
2
(0
=
ge(f)
sinbw
w
e- icw
_
iw\l2;
otherwise
I
x
7T
otherwise
{~
2
H
if -17 < x < b
C
jeW)
r;
> 0)
e-
\I 2
a1wl
a
ifO<x<b
b
+
-1
if b < x < 2b
ibw
2e
-
e-
\l2;w
2ibw
2
otherwise
5
r~~T
if x > 0
1
(a> 0)
\I2;(a
otherwise
6
7
8
9
10
r~x
e:
{e:
x
if b < x < c
eCa-iw)c _
e-a:il
sin ax
x
--
eCa-iw)b
V2-ii-Ca - iw)
otherwise
IT
\I
if -b <.r < b
sin b(w - a)
if b < x < c
eibCa-w) _
i
otherwise
V2;
(a> 0)
_1_
I
w-a
7T
otherwise
x
+ iw)
eicCa-w)
a-w
e-w2/4a
V2a
(a> 0)
--
H
if Iwl < a;
o iflwl
> a
532
CHAP. 11
Fourier Series, Integrals, and Transforms
--.................-. .. _ ......... =.....
........ _ • . - , ......... _ _ .. , .... _
_~
_.
........... _a iJ •
1. What is a Fourier series? A Fourier sine series? A
half-range expansion?
2. Can a discontinuous function have a Fourier series? A
Taylor series? Explain.
3. Why did we start with period 27f? How did we proceed
to functions of any period p?
4. What is the trigonometric system? Its main property by
which we obtained the Euler formulas?
TIONS AND PROBLEMS
[21-23J
Using the answers to suitable odd-numbered
problems, find the sum of
21. I - ~
+
22.
+
1·3
~
~
-
+
+
3·5
5·7
+ ...
23.1+b+~+
5. What do you know about the convergence of a Fourier
. ?
senes.
6. What is the Gibbs phenomenon?
7. What is approximation by trigonometric polynomials?
The minimum square error?
8. What is remarkable about the response of a vibrating
system to an arbitrary periodic force?
9. What do you know about the Fourier integral? Its
applications?
I
26. (Half-range expansion) Find the half-range sine series
of f(x) = 0 if 0 < x < 7f/2, f(x) = 1 if 7f/2 < x < 7f.
Compare with Prob. 12.
series of f(x) = x (0 < x < 27f). Compare with
Prob.20.
FOURIER SERIES
Find the Fourier series of f(x) as given over one period.
Sketch f(x). (Show the details of your work.)
if -1 < x < 0
{-:
11. f(x)
=
O<x<1
14. f(x)
if
29. For f(x)
7f/2 < x < 37f/2
X
< 27f)
~0-311
X
(-7f <
X
< 7f).
GENERAL SOLUTION
Solve y" + (lly = ret). where Iwl
is 27f-periodic and:
31. r(t) =
{2 - x
IS. f(x)
16. f(x) = { - I - x
1 - x
'* o. I. 2 . . . . . r(t)
=
< x < 3
if
if -1 < x < 0
if
0 < x < 1
Isin 8m:1 (-118 < x < 1/8)
19. f(x) = x 2 (-7f/2 < x < 7f/2)
=
(2
if -I < x < I
18. f(x) = eX (-7f < x < 7f)
20. f(x)
=
(-2<x<2)
x
17. f(x)
MINIMUM SQUARE ERROR
Compute the minimum square errors for the trigonometric
polynomials of degree N = I, ... , 8:
if -7f/2 < x < 7f/2
(-27f <
x
~8-291
28. For f(x) in Prob. 12.
if
C
12. f(x)
13. f(x)
2S. What are the sum of the cosine terms and the sum of
the sine terms in a Fourier series whose sum is f(x)?
Give two examples.
27. (Half-range cosine series) Find the half-range cosine
10. What is the Fourier sine transform? Give examples.
11l-20
24. (Parseval's identity) Obtain the result of Prob. 23 by
applying Parseval's identity to Prob. 12.
x (0 < x < 27f)
~2-371
FOURIER INTEGRALS AND
TRANSFORMS
Sketch the given function and represent it as indicated. If
you have a CAS, graph approximate curves obtained by
replacing ::to with finite limits; also look for Gibbs
phenomena.
32. f(x) = I if I < x < 2 and 0 otherwise, by a Fourier
integral
33. f(x) = x if 0 < x < 1 and 0 otherwise, by a Fourier
integral
Summary of Chapter 11
533
+ x/2
if -2 < x < o. f(x) = I - x/2 if
o < x < 2, ((x) = 0 othenvise, by a Fourier cosine
integral
35. f(x) = -I - x/2 if -2 < x < O. f(x) = 1 - x/2 if
o < x < 2, f(x) = 0 otherwise. by a Fourier sine
integral
36. f(x) = -4 + x 2 if -2 < x < 0, f(x) = 4 - x 2 if
o < x < 2, f(x) = 0 otherwise, by a Fourier sine
integral
34. f(x)
..
=
I
....
..
- - -- -___ .·."4"."...... _.......""''''-''---'-..... -..... ·_._ ......
....
--~
=4
- x 2 if -2 < x < 2. f(x)
a Fourier cosine integral
37. f(x)
= 0 otherwise. by
38. Find the Fourier transform of f(x) = k if
a < x < b. f(x) = 0 otherwise.
39. Find the Fourier cosine transform of f(x)
x > 0, f(x) = 0 if x < O.
=
e- 2x if
40. Find 9' c(e- 2x ) and 9' s(e- 2x ) by formulas involving
second derivatives
_-_
~
Fourier Series, Integrals, Transforms
Fourier series concern periodic functions f(x) of period p = 2L, that is. by definition
f(x + p) = f(x) for all x and some fixed p > 0; thus. f(x + IIp) = f(x) for any
integer 11. These series are of the form
117T
=
+ ~
f(x) = ao
(I)
( an cos -
n~l
+ bn
x
L
117T
X)
sin -
(Sec. 11.2)
L
with coefficients, called the Fourier coefficients of f(x), given by the Euler formulas
(Sec. 11.2)
- f
1
2L
(2)
L
11
= 1.
(l *)
2.
• ••.
1
f(x)
=
fL
117TX
f(x) cos - - dx
-L
L
117TX
f(x) sin - - dx
-L
L
L
27T
+~
ao
L
fL
= -
For period
1
= -
-L
bn
where
an
f(x) dx.
we simply have (Sec. 11.1)
(an cos nx
+ bn
sin I1X)
dx,
bn
71=1
with the Fourier coefficients of f(x) (Sec. L1.1)
I
ao =
27T
I
.,,-
L!(X) dx,
an
= -
7T
f
1
.,,-
_.,,-
f(x) cos
I1X
= -
7T
f
.,.,
f(x) sin nx dx.
_.,,-
Fourier series are fundamental in connection with periodic phenomena,
pm1icularly in models involving differential equations (Sec. 11.5, Chap. 12). If f(x)
is even [f( -x) = f(x)] or odd [f( - x) = - f(x)], they reduce to Fourier cosine or
Fourier sine series, respectively (Sec. 11.3). If f(x) is given for 0 ~ x ~ L only,
it has two half-range expansions of period 2L, namely, a cosine and a sine series
(Sec. 11.3).
534
CHAP.11
Fourier Series, Integrals, and Transforms
The set of cosine and sine functions in (I) is called the trigonometric system.
Its most basic property is its orthogonality on an interval of length 2L; that is, for
111 we have
all integers /11 and n
*
I
L
1117TX
117TX
cos - - cos - - dx
-L
L
L
and for all integers m and
=
I
0,
L
11l7TX
n 7[;r
sin - - sin - - dx
L
-L
L
=0
11.
I
L
ITI7rX
WlTX
L
L
cos - - sin - - dx = O.
-L
This 0l1hogonality was crucial in deriving the Euler formulas (2).
Partial sums of Fourier series minimize the square error (Sec. 11.6).
Ideas and techniques of Fourier series extend to non periodic functions f(x) defined
on the entire real line: this leads to the Fourier integral
(3)
f(x)
=
{O
+ B(w) sin wx]
[A(w) cos wx
dw
(Sec. 11.7)
o
where
1
(4)
A(w) = Tr
I_=
I
x
f(v) cos wv dv,
= -
B(w)
7r
I
x
f(v) sin wv dv
-GC
or, in complex form (Sec. 11.9),
(5)
1
f(x) = - -
V2;
IX f(w)e
A
twx
•
dw
-x
where
(6)
jew)
I
V2;
1
GC
.
f(x)e-tw:r dx.
= --
-x
Formula (6) transforms f(x) into its Fourier transform jew), and (5) is the inverse
transform.
Related to this are the Fourier cosine transform (Sec. 11.8)
(7)
je(w) =
p; I
x
-
7r
f(x) cos
H'X
dx
0
and the Fourier sine transform (Sec. 11.8)
(8)
A
fs(W)
=
p; I
-
7r
00
f(x) sin wx dx.
0
The discrete Fourier transform (DFT) and a practical method of computing it,
called the fast Fourier transform (FFT), are discussed in Sec. I 1.9.
CHAPTER
12
'f·
Partial Differential Equations
(PDEs)
PDEs are models of various physical and geometrical problems, arising when the unknown
functions (the solutions) depend on two or more variables, usually on time t and one or
several space variables. It is fair to say that only the simplest physical systems can be
modeled by ODEs, whereas most problems in dynamics, elasticity, heat transfer,
electromagnetic theory, and quantum mechanics require PDEs. Indeed, the range of
applications of PDEs is enormous, compared to that of ODEs.
In this chapter we concentrate on the most important PDEs of applied mathematics, the
wave equations governing the vibrating string (Sec. 12.2) and the vibrating membrane
(Sec. 12.7), the heat equation (Sec. 12.5), and the Laplace equation (Secs. 12.5, 12.10).
We derive these PDEs from physics and consider methods for solving initial and
boundary value problems, that is. methods of obtaining solutions satisfying conditions
that are given by the physical situation.
In Secs. 12.6 and 12.11 we show that PDEs can also be solved by Fourier and Laplace
transform methods.
COMMENT. Numerics for PDEs is explained in Secs. 21.4-21.7.
Prerequisites: Linear ODEs (Chap. 2), Fourier series (Chap. 11)
Sections that may be omitted ill a shorter course: 12.6, 12.9-12.11
References and Answers to Problems: App. 1 Part C, App. 2
12.1
Basic Concepts
A partial differential equation (PDE) is an equation involving one or more partial
derivatives of an (unknown) function, call it u, that depends on two or more variables,
often time t and one or several variables in space. The order of the highest derivative is
called the order of the PDE. As for ODEs. second-order PDEs will be the most imp0l1ant
ones in applications.
Just as for ordinary differential equations (ODEs) we say that a PDE is linear if it is
of the first degree in the unknown function u and its partial derivatives. Otherwise we call
it nonlinear. Thus, all the equations in Example 1 on p. 536 are linear. We call a linear
PDE homogeneous if each of its terms contains either u or one of its partial derivatives.
Otherwise we call the equation nonhomogeneous. Thus, (4) in Example I (with f not
identically zero) is nonhomogeneous, whereas the other equations are homogeneous.
535
CHAP. 12
536
E X AMP L E 1
Partial Differential Equations (PDEs)
Important Second-Order POEs
(I)
Olle-dimellsiollal WUl'e equation
(2)
One-dimensional heat equation
(3)
Two-dimensiollal Laplace equatioll
(4)
Two-dimensional Poissoll equatioll
(5)
(J211 =
ilt2
(6)
-
a2 u
iJx 2
c
2
2
(iJ U
ax 2
iJ2 11
+ -
iJy2
2
+ a
u)
Two-dimellsiollal wave equatioll
iJy2
(J2 u
+ -
iJ72
=0
Three-dimellsiollal Laplace equation
Here c is a posItive constant, t is time, x. y • .: are CartesIan coordinates, and dimensioll is the number of these
coordinates in the equation.
•
A solution of a PDE in some region R of the space of the independent variables is a
function that has all the partial derivatives appearing in the PDE in some domain D
(definition in Sec. 9.6) containing R, and satisfies the PDE everywhere in R.
Often one merely requires that the function is continuous on the boundary of R. has
those derivatives in the interior of R, and satisfies the PDE in the interior of R. Letting
R lie in D simplifies the situation regarding derivatives on the boundary of R, which is
then the same on the boundary as it is in the interior of R.
In general, the totality of solutions of a PDE is very large. For example, the functions
u
= eX cosy,
u = sin x cosh y,
which are entirely different from each other, are solutions of (3), as you may verify. We
shall see later that the unique solution of a PDE corresponding to a given physical problem
will be obtained by the use of additional conditions arising from the problem. For
instance, this may be the condition that the solution u assume given values on the boundary
of the region R ("boundary conditions"). Or, when time t is one of the variables, u (or
U t = uu/ut or both) may be prescribed at t = 0 ("initial conditions").
We know that if an ODE is linear and homogeneous, then from known solutions we
can obtain further solutions by superposition. For PDEs the situation is quite similar:
THEOREM 1
Fundamental Theorem on Superposition
If Ul and U2 are solutions of a homoge1leous linear PDE in some regioll R, then
with an.v constants
Cl
and C2 is also a solution of that PDE in the region R.
The simple proof of this imp0l1ant theorem is quite similar to that of Theorem I in
Sec. 2.1 and is left to the student.
SEC. 12.1
Basic Concepts
537
Verification of solutions in Probs. 14-25 proceeds as for ODEs. Problems 1-12 concern
PDEs solvable like ODEs. To help the student with them. we consider two typical
examples.
E X AMP L E 2
Solving
Uxx -
Find solutions
U
= 0 Like an ODE
of the PDE
II
Uxx -
It =
0 depending on x and y.
Soluti01l. Since no y-derivatives occur. we can solve this PDE like uTI - u = O. In Sec. 2.2 we would have
obtained u = Ae x + Be with constant A and B. Here A and B may be functions of y. so that the answer i,
-.l'
u(x,
A(y)e x + B(y)e-x
y) =
with arbitrary functions A and B. We thus have a great variety ot solution~. Check the result by differentiation. •
E X AMP L E 3
Solving u xy
Find solutions
= -UK Like an ODE
II =
lI(X.
y)
of this PDE.
Soluti01l. Setting "x = p. we have Py =
-p.
pylp = -I.
y
p = c!x)e- and by
lnp = -y + ('(x).
integration with respect to x,
u(x, yl
=
J(x)e- y
+ .!,'(y)
where
J(x) =
f
c(x) £Ix;
•
here, J(x) and g(y! are arbitrary.
--- -11-121
PDEs SOLVABLE AS ODEs
118-211
This happens if a PDE involves derivatives with respect to
one variable only (or can be transformed to such a form),
so that the other variable(s) can be treated as parameter(s).
Solve for u = u(x. y):
1.
U yy
+
16u
= 0
2. U.l : X = U
4. u y + 2yu
3. Uyy = 0
5. u y + U = e XY
7. u y = (cosh x)yu
9. y 2 u yy + 2yu y - 2u = 0
11.
u xy
12.
U yy
=
+
0
6. U xx = 4y 2 u
8. u y = 2xyu
10. Uyy = 4xl/y
Ux
lOuy
+
25u
=
e- 5y
13. (Fundamental
Theorem)
Prove
Fundamental
Theorem I for second-order PDEs in two and three
independent variables.
114-251
18.
It
2U.
It
-
Heat Equation (2) with suitable c
e- 2kt cos 8x
19.
II
e- w2t sin 4x
e- 4w2t sin wx
21.
U
e-w2c2t
122-251
cos wx
Laplace Equation (3)
22.
U
in (7) in the text
23.
U
cos 2y sinh 2x
24.
U
= arctan (ylx)
25.
U -
e
y2
x2
-
sin 2xJ
26. TEAM PROJECT. Verification of Solutions
(a) Wave equation. Verify that
u(.\". t) = vex + ell + w(x - el) with any twice
differentiable functions v and w satisfies (I).
(b) Poisson equation. Verify that each u satisfies (4)
with f(x. y) as indicated.
u
VERIFICATION OF SOLUTIONS
=
U =
Verify (by substitution) that the given function is a solution
of the indicated PDE. Sketch or graph the solution as a
surface in space.
u
X4
+
y4
cos.\ sin y
= .vlx
f
=
12(x 2
+
y2)
f = -2 cos x sin
f =
y
2)'lx 3
(c) Laplace equation. Verify that
[14-171
Wave Equation (1) with suitable c
+
14.
U -
4x 2
16.
U =
sin 3x sin 18t
(2
15.
U
sin 8x cos 2l
IIVx 2 + y2 + Z2 satisfies (6) and
u = In (x 2 + y2) satisfies (3). Is u =
17.
U =
sin kx cos ket
solution of (3)? Of what Poisson equation?
u =
l/Vx 2 + y2 a
538
CHAP. 12
Partial Differential Equations (PDEs)
(d) Verify that II with any (sufficiently often
differentiable) v and w satisfies the given PDE.
+
1I
= vex)
!I
= v(xhl"(y)
1/
= vex
+
equation (3) and determine a and b so that u satisfies
the boundary conditions 1I = 110 on the circle
x 2 + .1'2 = 1 and II = 0 on the circle.\"2 + y2 = 100.
II"(Y)
31)
12S-301
+
w( \" -
31)
27. (Boundary value problem) Verify that the function
u(x, y) = a In (x 2 + ,.2) + b satisfies Laplace's
12.2
SYSTEMS OF PDEs
Solve
28.
!Ix
29.
U"X
= 0,
!l xy
= 0
30.
l/xx
=
n,
llyy
=
= 0,
lly
= 0
n
Modeling: Vibrating String, Wave Equation
As a first important POE let us derive the equation modeling small transverse vibrations
of an elastic string, such as a violin string. We place the string along the x-axis, stretch it
to length L, and fasten it at the ends x = 0 and x = L. We then dist0l1 the string, and at
some instant. call it t = 0, we release it and allow it to vibrate. The problem is to determine
the vibrations of the string. that is, to find its deflection u(x, 1) at any point x and at any
time t > 0; see Fig. 283.
u(x, t) will be the solution of a POE that is the model of our physical system to be
derived. This POE should not be too complicated. so that we can solve it. Reasonable
simplifying assumptions Uust as for ODEs modeling vibration~ in Chap. 2) are as
follows.
Physical Assumptions
1. The mass of the string per unit length is constant ("homogeneous stl;ng"). The string
is perfectly elastic and does not offer any resistance to bending.
2. The tension caused by stretching the string before fastening it at the ends is so large
that the action of the gravitational force on the string (trying to pull the string down
a little) can be neglected.
3. The string performs small transverse motions in a ve11ical plane: that is, every particle
ofthe string moves strictly vel1ically and so that the deflection and the slope at every
point of the string always remain small in absolute value.
Under these assumptions we may expect solutions lI(X, t) that describe the physical
reality sufficiently well.
u
f3
p
Q:....",."'""'~~. T2
-....--',
r
r
r
r
r
r
o
x
Fig. 283.
x+ili:
L
Deflected string at fixed time t. Explanation on p. 539
SEC. 12.2
Modeling: Vibrating String. Wave Equation
539
Derivation of the PDE of the Model
(,lWave Equation") from Forces
The model of the vibrating string will consist of a PDE (""wave equation") and additional
conditions. To obtain the PDE, we consider the forces acting 011 a small portion of the
string (Fig. 283). This method is typical of modeling in mechanics and elsewhere.
Since the string offers no resistance to bending. the tension is tangential to the curve
of the string at each point. Let T] and T2 be the tension at the endpoints P and Q of that
portion. Since the points of the string move vertically, there is no motion in the horizontal
direction. Hence the horizontal components of the tension must be constant. Using the
notation shown in Fig. 283. we thus obtain
(1)
Tl cos
ll'
=
T2 cos
f3 =
T
=
COllst.
In the vertical direction we have two forces. namely, the vertical components - T] sin ll'
and T2 sin f3 of Tl and T2; here the minus sign appears because the component at P is
directed downward. By Newton's second law the resultant of these two forces is equal
to the mass p ~x of the portion times the acceleration a2uICJ(2, evaluated at some point
between x and x + ~x; here p is the mass of the un deflected string per unit length. and
~x is the length of the portion of the undeflected string. (~ is generally used to denote
small quantities; this has nothing to do with the Laplacian V2, which is sometimes also
denoted by ~.) Hence
Using 0), we can divide this by T2 cos
(2)
Now tan
ll'
and tan
tan
Tl sin
ll'
Tl cos
ll'
f3 =
Tl cos
ll'
= T, obtaining
= tan f3 - tan ll' =
p ~x
-T
f3 are the slopes of the string at x and x +
ll'
=
(~u)
I
dx
and
tanf3=
x
a2 u
0"(2
~x:
I
(-au)
ax
.
x+.lx
Here we have to write partial derivatives because u depends also on time (. Dividing (2)
b) ~x, we thus have
If we let
(3)
~x
approach zero, we obtain the linear PDE
p
This is called the one-dimensional wave equation. We see that it is homogeneous and
of the second order. The physical constant Tip is denoted by c 2 (instead of c) to indicate
540
CHAP. 12
Partial Differential Equations (PDEs)
that this constant is positive. a fact that will be essential to the form of the solutions.
"One-dimensional" means that the equation involves only one space variable. x. In the
next section we shall complete setting up the model and then show how to solve it by a
general method that is probably the most imp0l1ani one for PDEs in engineering
mathematics.
12.3
Solution by Separating Variables.
Use of Fourier Series
The model of a vibrating elastic string (a violin string, for instance) consists of the
one-dimensional wave equation
(1)
p
for the unknown deflection tI(x, t) of the string, a PDE that we have just obtained, dnd
some additiollal cOllditiolls, which we shall now derive.
Since the string is fastened at the ends x = 0 and x = L (see Sec. 12.2). we have the
two boundary conditions
(2)
~a)
ufO, t)
=
0,
(b)
ll(L, t)
=
0
for all
T.
FUl1hermore, the form of the motion of the string will depend on its initial deflection
(deflecrion ar time t = 0), call it f(x), and on its iniTiall'eiocity (velocity at t = 0). call
it g(x). We thus have rhe rwo initial conditions
(3)
(a)
l/(x. 0)
= f(x) ,
(b)
Ut(x. 0)
=
g(x)
(0 ~ x ~ L)
where lit = ilu/at. We now have to find a solution of the PDE (I) satisfying the conditions
(2) and (3). This will be the suI uti on of our problem. We shall do this in three steps, as
follows.
Step 1. By the "method of separating variables" or product meThod. setting
u(x. t) = F(x)G(t). we obtain from (l) two ODEs. one for F(x) and the other one for G(t).
Step 2. We determine solutions of these ODEs that satisfy the boundary conditions (2).
Step 3. FinaIl y, using Fourier series, we compose the solutions gained in Step 2 to obtain
a solution of (l) satisfying both (2) and (3), that is, the solution of our model of the
vibrating string.
Step 1. Two ODEs from the Wave Equation (1)
In the method of separating variables, or product method, we determine solutions of the
wave equation (I) of the form
(4)
ll(X, 1)
=
F(x)G(T)
SEC 12.3
Solution by Separating Variables. Use of Fourier Series
541
which are a product of two functions. each depending only on one of the variables x and
t. This is a powerful general method that has various applications in engineering
mathematics. as we shall see in this chapter. Differentiating (4), we obtain
and
where dots denote derivatives with respect to T and primes derivatives with respect to x.
By inserting this into the wave equation (1) we have
Dividing by c 2 FG and simplifying gives
The variables are now separated, the left side depending only on t and the right side only
on x. Hence both sides must be constant because if they were variable. then chdnging
t or x would affect only one side. leaving the other unaltered. Thus. say,
F"
- F =k .
Multiplying by the denominators gives immediately two ordinary DEs
F" - kF = 0
(5)
and
(6)
Here. the separation constant k is still arbitrary.
Step 2. Satisfying the Boundary Conditions (2)
We now determine solutions F and G of (5) and (6) so that u = FG satisfies the boundary
conditions (2), that is,
(7)
u(O, t)
=
We first solve (5). If G
and then by (7),
(8)
F(O)G(t)
0=
(a)
= 0,
0, then u = FG
F(O)
=
o.
u(L, t)
=0
=
F(L)G(t)
=
0
for all
T.
0, which is of no interest. Hence G "'" 0
(b)
F(L)
=
o.
We show that k must be negative. For k = 0 the general solution of (5) is F = ax + b,
and from (8) we obtain a = b = 0, so that F 0= 0 and II = FG 0= 0, which is of no interest.
For positive k = J..t2 a general solution of (5) is
542
CHAP. 12
Partial Differential Equations (PDEs)
and from (8) we obtain F ~ 0 as before (verify!). Hence we are left with the possibility
of choosing k negative, say, k = _p2. Then (5) becomes F" + p2F = 0 and has as a
general solution
F(x)
=
+B
A cos px
sin px.
From this and (8) we have
F(O)
We must take B
*-
=
=0
pL
=
F(L)
and then
0 since otherwise F
(9)
Setting B
A
~
= 117T,
= B sinpL = O.
O. Hence sinpL
= O. Thus
117T
p=-
so that
1, we thus obtain infinitely many solutions F(x)
(10)
Fn(x)
= sin
(ll integer).
L
=
F,l>:), where
117T
L
In
-x
= I, 2, .. ').
These solutions satisfy (8). [For negative integer 11 we obtain essentially the same solutions,
except for a minus sign, because sin (-ex) = -sin ex.]
We now solve (6) with k = _p2 = -(Il7TIL)2 resulting from (9), that is.
(11 *)
An = cp =
where
C177T
L
A general solution is
Hence solutions of (I) satisfying (2) are unlx,
t)
=
Fn(x)Gn(t)
=
Gn(t)Fn(X), written out
(11)
(11
= I. 2... ').
These functions are called the eigenfunctions, or characteristic junctions, and the values
A" = C177TIL are called the eigenvalues, or characteristic ralues, of the vibrating sHing.
The set {AI, A2 , ••• } is called the spectrum.
Discussion of Eigenfunctions. We see that each Un represents a harmonic motion having
the frequency A,,!27T = cnl2L cycles per unit time. This motion is called the 11th normal
mode of the string. The first normal mode is known as the fillldalllelltal mode (17 = 1),
and the others are known as overto17es; musically they give the octave, octave plus fifth,
etc. Since in (II)
.
177TX
L
Sll1 - -
=0
at
x=
L
2L
11
11
11-1
--L,
11
the nth normal mode has 11 - I nodes, that is, points of the string that do not move (in
addition to the fixed endpoints); :-.ee Fig. 284.
SEC. 12.3
Solution by Separating Variables. Use of Fourier Series
543
I~'J
o
L
n=l
n=2
Fig. 284.
n=4
n=3
Normal modes of the vibrating string
Figure 285 shows the second normal mode for various values of t. At any instant the
string has the form of a sine wave. When the left part of the string is moving down, the
other half is moving up, and conversely. For the other modes the situation is similar.
Tuning is done by changing the tension T. Our formula for the frequency AJ27T = cll!2L
of Un with c =
[see (3), Sec. 12.2] confirms that effect because it shows that the
frequency is proportional ro the tension. T cannot be increased indefinitely, but can you
see what to do to get a string with a high fundamental mode? (Think of both Land p.)
Why is a violin smaller than a double-bass?
VfiP
........-, ..,..
x
___ -'.0"
Fig. 285.
Second normal mode for various values of t
Step 3. Solution of the Entire Problem. Fourier Series
The eigenfunctions (II) satisfy the wave equation (l) and the boundary conditions (2)
(string fixed at the ends). A single Un will generally not satisfy [he initial conditions (3).
But since the wave equation (I) is linear and homogeneous, it follows from Fundamental
Theorem 1 in Sec. 12.\ that the sum of finitely many solutions Un is a solution of (I). To
obtain a solution that also satisfies the initial conditions (3), we consider the infinite series
(with An = Cl17TIL as before)
<Xl
(12)
u(x, t)
= ~
ex:
un(x, 1) = ~ (Bn cos Ant
n~l
117T
+ Bn * sin Ant) sin LX.
n~l
Satisfying Initial Condition (3a) (Given Initial Displacement).
we obtain
117T
<Xl
(\3)
From (12) and (3a)
u(x, 0) =~l Bn sin LX
= f(x).
Hence we must choose the Bn's so that u(x. 0) becomes the Fourier sine series of f(x).
Thus, by (4) in Sec. 11.3,
(14)
Bn
2
= -
L
fL f(x) sin --dx,
117TX
0
L
11
= 1,2,·· '.
544
CHAP. 12
Partial Differential Equations (PDEs)
Satisfying Initial Condition (3b) (Given Initial Velocity).
(12) with respect to t and using (3b). we obtain
117TX
co
.L
=
Similarly, by differentiating
Bn*An sin
L
=
g(x).
n~l
Hence we must choose the Bn*' s so that for t = 0 the derivative aulat becomes the Fourier
sine series of g(x). Thus, again by (4) in Sec. 11.3,
BrI *ArI
Since An =
CI117IL,
=
2
L
t
0
g(x) sin
L1117X
dx.
we obtain by division
BrI *
(15)
= -2-
t
CIl17
0
1117X
g(x) sin - - d\:,
11
L
= 1,2,···.
Result. Our discussion shows that u(x, t) given by (12) with coefficients (14) and (15)
is a solution of (I) that satisfies all the conditions in (2) and (3), provided the series (12)
converges and so do the selies obtained by differentiating (12) twice tennwise with respect
to x and t and have the sums a2 ulax 2 and a2 ulat 2 . respectively, which are continuous.
Solution (12) Established. According to our derivation the solution (12) is at first a
purely formal expression, but we shall now establish it. For the sake of simplicity we
consider only the case when the initial velocity g(x) is identically zero. Then the Bn* are
zero, and (12) reduces to
1117X
co
u(x, t) =
(16)
.L
C1117
Bn cos Ant sin - L - ,
L
n~l
It is possible to sum this series, that is. to wlite the result in a closed or finite form. For
this purpose we use the formula [see (l L), App. A3.I]
cos
1117
1
LC1117 t sin LX
= 2
[ sin {
L
1117
(x - ct) }
+ sin {
L
1117
(x
+ ct) } ]
Consequently, we may write (16) in the form
l/(x, t) =
co
21 .L
Bn sin
L
{ 1117
(x - ct)
}
+ 21
n~l
co
.L
Bn sin
(x
+ cO } .
n~l
These two series are those obtained by substituting x - ct and x
the variable x in the Fourier sine series (13) fOf j(x). Thus
(17)
L
{ 1117
u(x, t) = Hf*(x - ct)
+ f*(x + ct)]
+
ct, respectively, for
SEC 123
545
Solution by Separating Variables. Use of Fourier Series
where f* is the odd periodic extension of f with the period 2L (Fig. 286). Since the initial
deflection f(x) is continuous on the interval 0 ~ x ~ L and zero at the endpoints, it follows
from (17) that u(x, t) is a continuous function of both variables x and t for all values of
the variables. By differentiating (17) we see that u(x, t) is a solution of (1), provided f(x)
is twice differentiable on the interval 0 < x < L, and has one-sided second derivatives at
x = 0 and x = L, which are zero. Under these conditions u(x. t) is established as a solution
of (1), satisfying (2) and (3) with g(x) == O.
•
'J
Fig. 286.
J..............-
Odd periodic extension of {{x)
Generalized Solution. 1ft' (x) and f"(x) are merely piecewise continuous (see Sec. 6.1),
or if those one-sided derivatives are not zero, then for each t there will be finitely many
values of x at which the second delivatives of u appearing in (1) do not exist. Except at
these points the wave equation will still be satisfied. We may then regard u(x, t) as a
"generalized solution," as it is called, that is, as a solution in a broader sense. For instance,
a triangular initial deflection as in Example I (below) leads to a generalized solution.
Physical Interpretation of the Solution (17). The graph of f*(x - ct) is obtained from
the graph of f*(x) by shifting the latter ct units to the right (Fig. 287). This means thal
f*(x - ct) (c > 0) represents a wave that is traveling to the right as t increases. Similarly,
f*(x + ct) represents a wave that i:. traveling to the left. and u(x. t) is the superposition
of these two waves.
x
Fig. 287.
E X AMP L E 1
Interpretation of (l7)
Vibrating String if the Initial Deflection Is Triangular
Find the solution of the wave equation (1) cOlTesponding to the triangular initial deflection
2J..
f(x) =
~x
{ T(L - x)
L
O<x<-
if
2
L
-<x<L
if
2
and initial velocity zero. (Figure 288 shows f(x) = I/(X, 0) at the top.)
Solutioll. Since g(x) == 0, we have Bn * = 0 in (12). and from Example 4 in Sec. 11.3 we see that the Bn
are given by (5), Sec. 11.3. Thus (12) takes the fonn
I/(x, t) =
81.. [
7T2
I
)2
sin
I
L7T x cos L7TC t - )2
sin
37T
37TC
LX cos L
t
+ - ."
J.
For graphing the solution we may use u(x, 0) = fer) and the above interpretation of the two functions in the
representation (17). This leads to the graph shown in Fig. 2g8.
•
546
CHAP. 12
Partial Differential Equations (PDEs)
/s:
o
t=0
L
/
t = Ll5e
v'''--______'''.,. t = 2L15e
t = Ll2c
--
-t = 4L15c
,
,1
t = LIe
~2f*(X-L)
=!{"(x +L)
2
Fig. 288. Solution u(x, t) in Example 1 for various values of t (right part
of the figure) obtained as the superposition of a wave traveling to the
right (dashed) and a wave traveling to the left (left part of the figure)
11-10 I
DEFLECTION OF THE STRING
7.
Find u(x, t) for the string of length L = 1 and c 2 = 1 when
the initial velocity is zero and the initial deflection with
small k (say, 0.01) is as follows. Sketch or graph U(X, 1) as
in Fig. 288.
1. k sin
2. k(sin TTX -l sin
4. kx(l - X 2 )
27TX
3. h(l - x)
5.
37TX)
1
4
8.
ol
t/
1
4
1
"2
1
4
L
/'
1
0.1~._!
6
*b
4
1
-4
0.5
"
3
4
9.
"
It
vA,
1
4
1
"2
3
4
SEC. 12.3
Solution by Separating Variables. Use of Fourier Series
547
10.
2A(1 - cos
+
0.8
11. (Frequency) How does the frequency of the
fundamental mode of the vibrating string depend on
the length of the string? On the mass per unit length?
What happens to the string if we double the tension?
Why is a contrabass larger than a violin?
12. (Nonzero initial velocity) Find the deflection II(X, t)
of the string of length L = 7f and ("2 = I for zero
initial displacement and "triangular" initial velocity
II t (x, 0) = 0.0 I x if 0 :;;; x ~ ~7f, II t (X, 0) = 0.0 I (7f - x)
if ~7f ~ x :;;; 7f. (Initial conditions with lIix, 0) ~ 0 are
hard to realize experimentally.)
13. CAS PROJECT. Graphing Normal Modes. Write a
program for graphing lin with L = 7f and c 2 of your
choice similarly as in Fig. 284. Apply the program to
112, U3, 114' Also graph these solutions as surfaces over
thext-plane. Explain the connection between these two
kinds of graphs.
nTo(An
2
-
1I7f)
2
w )
.
sm wt.
Determine B" and Bn * so that II satisfies the initial
conditions u(x, 0) = f(x), uix, 0) = O.
(d) (Resonance) Show that if An = w, then
-
A
- - (1 n7fW
cos
1l7f)T
cos
WT.
(e) (Reduction of boundary conditions) Show that
a problem (1)-(3) with more complicated boundary
conditions 11(0, t) = 0, u(L, t) = h(t), can be reduced
to a problem for a new function v satisfying conditions
v(O, t) = v(L, f) = 0, vex. 0) = fl(x), Vt(x, 0) = gl(X)
but a nonhomogeneous wave equation. Him: Set
II = V + I\" and determine w suitably.
14. TEAM PROJECT. Forced Vibrations of an Elastic
String. Show the following.
(a) Substitution of
x
(17)
II(X,
t) =
Il7fX
L
GnU) sin
L
n=l
y
u
(L = length of the string) into the wave equation (I)
governing free vibrations leads to [see
••
(18)
Gn
2_
+
An G -
Fig. 189.
(lO'~)J
Elastic beam
SEPARATION OF A FOURTH-ORDER POE.
VIBRATING BEAM
0,
(b) Forced vibrations of the string under an external
force P(x, t) per unit length acting normal to the string
are governed by the PDE
By the prinCiples used in modeling the string it can be
shown that small free vertical vibrations of a uniform elastic
beam (Fig. 289) are modeled by the fourth-order PDE
(21)
(19)
(c) For a sinusoidal force P = Ap sin wt we obtain
p
p
n7fX
ex
=
A sin wt =
L
n~l
knCt) sin
L
'
(20)
_ {(4A11l7f)
k,,(t) -
sin wt
o
(11
odd)
(n
even).
Substituting (17) and (20) into (19) gives
••
Gn
+
2
_
An G n -
2A
-
f/7f
(l -
,,'
cos 1l7f) sm wt.
(Ref. [CII])
where c 2 = EIIpA (E = Young's modulus of elasticity,
I = moment of intertia of the cross section with respect to
the y-axis in the figure, p = density, A = cross-sectional
area). (Bending of a beam under a load is discussed in
Sec. 3.3.)
15. Substituting
II
=
F(x)G(t) into (21), show that
F(4)/F = -C/c 2 G =
13 4 =
conST,
+ B sin f3x
+ C cosh f3x + D sinh f3x,
F(x) = A cos f3x
G(l)
CHAP. 12
548
Partial Differential Equations (PDEs)
.f
=-=:l1..
~
x=o
x=L
+=
(A) Simply supported
(E) Clamped at both
ends
x=L
(e) Clamped at the left
I
I
x=L
x=o
Fig. 290.
end, free at the
right end
Supports of a beam
16. (Simply supported beam in Fig. 290A) Find solutions
un = Fn(x)Gn(t) of (21) corresponding to zero initial
velocity and satisfying the boundary conditions (see
Fig. 290A)
u(O, t) = 0, u(L, t) = 0
(ends simply supported for all times t),
uxx(O, t) = 0, uxx(L, t) = 0
(zero moments, hence zero curvature, at the ends).
17. Find the solution of (21) that satisfies the conditions in
Prob. 16 as well as the initial condition
u(x, 0) = f(x) = x(L - x).
12.4
18. Compare the results of Probs. 17 and 3. What is the
basic difference between the frequencies of the
normal modes of the vibrating string and the vibrating
beam?
19. (Clamped beam in Fig. 290B) What are the boundary
conditions for the clamped beam in Fig. 290B? Show
that F in Prob. 15 satistles these conditions if {3L is a
solution of the equation
(22)
cosh {3L cos {3L = I.
Determine approximate solutions of (22), for instance,
graphically from the intersections of the curves of
cos {3L and lIcosh {3L.
20. (Clamped-free beam in Fig. 290C) If the beam is
clamped at the left and free at the right (Fig. 290C),
the boundary conditions are
u(O, t) = 0,
uxx(L, t)
=
0,
uxxx(L, t)
=
O.
Show that F in Prob. 15 satisfies these conditions if {3L
is a solution of the equation
cosh {3L cos (3L
(23)
=
-
1.
Find approximate solutions of (18).
D'Alembert's Solution
of the Wave Equation.
Characteristics
It is interesting that the solution (17), Sec. 12.3, of the wave equation
(1)
p
can be immediately obtained by transforming (1) in a suitable way, namely, by introducing
the new independent variables
(2)
v = x + ct,
w
= x - ct.
Then u becomes a function of v and w. The delivatives in (1) can now be expressed in
terms of delivatives with respect to v and w by the use of the chain rule in Sec. 9.6.
Denoting partial delivatives by subscripts, we see from (2) that Vx = I and Wx = I. For
simplicity let us denote u(x, t), as a function of v and w, by the same letter u. Then
SEC 12.4
D'Alembert's Solution of the Wave Equation. Characteristics
549
We now apply the chain rule to the right side of this equation. We assume that all the
partial derivatives involved are continuous, so that U wv = uvw ' Since Vx = I and Wx = 1,
we obtain
Transforming the other derivative in (1) by the same procedure. we find
By inserting these two results in (I) we get (see footnote 2 in App. A3.2)
(3)
=
awav
o.
The point of the present method is that (3) can be readily solved by two successive
integrations, first with respect to wand then with respect to v. This gives
all
av
= h(v)
and
= fh(V)
u
dv
+
I/J(w).
Here h(v) and I/J(w) are arbitrary functions of v and w, respectively. Since the integral is
a function of v, say, cfJ(v). the solution is of the form u = cfJ(v) + I/J(w). In tem1S of
x and t, by (2), we thus have
(4)
u(x, t)
=
cfJ(x
+
+
ct)
I/J(x -
ct).
Thi~ is known as d' Alembert's solution l of the wave equation (1),
Its derivation was much more elegant than the method in Sec. 12.3. but d'Alembert's
method is special, whereas the use of Fourier series applies to various equations, as we
shall see.
D'Alembert's Solution Satisfying the Initial Conditions
(5)
(a)
U(X. 0)
= f(x).
(b)
Ut(X. 0)
=
g(x).
These are the same as (3) in Sec. 12.3. By differentiating (4) we have
(6)
Ut(x, t)
= ccfJ' (x + ct)
- cl/J' (x -
ct)
IJEAN LE RONO O'ALEMBERT (1717-1783). French mathematician, also known for his important work
in mechanics.
We mention that the general theory of POEs provides a systematic way for rmding the transformation (2) that
simplifies (I). See Ref. [e8] in App. I.
sso
CHAP. 12
Partial Differential Equations (PDEs)
where primes denote delivatives with respect to the entire arguments x + ct and x - ct,
respectively, and the minus sign comes from the chain rule. From (4}-(6) we have
+
(7)
U(x, 0)
=
<p(x)
(8)
Ut(X, 0)
=
C<p' (x) - crJ/ (x)
",(x)
=
f(x),
= g(x).
Dividing (8) by c and integrating with respect to x. we obtain
I
(9)
<p(x) - "'(x)
= k(xo) + -
c
I
x
g(s) ds,
Xo
If we add this to (7). then '" drops out and division by 2 gives
I
(10)
= -
<p(X)
2
f(x)
+ -
I
~
I
I
x
g(s) ds
~
+ -
k(xo)·
2
Similarly, subtraction of (9) from (7) and division by 2 gives
",(x) = -1 f(x) -
(11)
2
-I
2c
r
-I k(xo).
g(s) ds -
2
Xo
In (10) we replace x by x + ct; we then get an integral from Xo to x + ct. In (11) we
replace x by x - ct and get minus an integral from Xo to x - ct or plus an integral from
x - ct to xo. Hence addition of <p(x + ct) and "'(x - ct) gives u(x, t) [see (4)] in the form
1
(12)
u(x. t)
= -
2
[J(x
+ ct) + f(x
- ct)]
+-
1
2c
I
x+ct
g(s) ds.
x-ct
If the initial velocity is zero. we see that this reduces to
(13)
u(x, t)
=
Uf(x
+ ct) + f(x
- ct)],
in agreement with (17) in Sec. 12.3. You may show that because of the boundary conditions
(2) in that section the function f must be odd and must have the period 2L.
Our result shows that the two initial conditions [the functions f(x) and g(x) in (5)]
determine the solution uniquely.
The solution of the wave equation by the Laplace transform method will be shown in
Sec. 12.11.
Characteristics. Types and Normal Forms of PDEs
The idea of d' Alembert's solution is just a special instance of the method of
characteristics. This concems PDEs of the form
(14)
SEC. 12.4
sst
D'Alembert's Solution of the Wave Equation. Characteristics
(as well as PDEs in more than two variables). Equation (14) is called quasilinear because
it is linear in the highest delivatives (but may be arbitrary otherwise). There are three
types of PDEs (14), depending on the discliminant AC - B2, as follows.
Type
Defining Condition
Hyperbolic
AC - B2 < 0
Wave equation (1)
Parabolic
AC - B2 = 0
Heat equation (2)
Elliptic
AC - B2 > 0
Laplace equation (3)
Example in Sec. 12.1
Note that (I) and (2) in Sec. 12.1 involve t, but to have y as in (14), we set y = ct in
(1). obtaining Utt - c 2 u xx = c 2 (u yy - uxx ) = O. And in (2) we set .r = c 2 t. so that
Ut c 2 U= -_ C 2(u y - Ux~') .
A, B, C may be functions of x, y, so that a PDE may be of mixed type, that is, of
different type in different regions of the xy-plane. An important mixed-type PDE is the
Tricomi equation (see Prob. 10).
Transformation of (14) to Normal Form. The normal forms of (14) and the
corresponding transformations depend on the type of the PDE. They are obtained by
solving the characteristic equation of (14), which is the ODE
Ay'2 - 2By' + C = 0
(15)
where y' = dy/dx (note -2B, not +2B). The solutions of (I5) are called the characteristics
of (14), and we write them in the form (I\x, y) = const and 'l'(x, y) = const. Then the
transformations giving new variables v, winstead of x, y and the normal fonns of (14)
are as follows.
Type
New Variables
Normal Form
Hyperbolic
v=<fJ
w='l'
uvw = Fl
Parabolic
v=x
w=<fJ='l'
U ww
Elliptic
V
= ~(<fJ
+ 'l')
1
tv
= 2i (<fJ - 'l')
= F2
uvv + uww
=
Fg
Here, <I) = (I\x, y), 'l' = 'l'(x, y), Fl = F1(v, w, u, u v , uw ), etc., and we denote u as
function of v, w again by u, for simplicity. We see that the normal form of a hyperbolic
PDE is as in d' Alembert's solution. In the parabolic case we get just one family of solutions
<I) = 'l'. In the elliptic case, i = yq, and the characteristics are complex and are of
minor interest. For derivation, see Ref. [GR3] in App. 1.
EX AMP LEt
D'Alembert's Solution Obtained Systematically
The theory of characteristics gives d' Alembert's solution in a systematic fashion. To see this, we write the wave
2
equation Utt - c u xx ~ 0 in the form (14) by setting Y ~ ct. By the chain rule, Ut = uyYt = cU y and
2
2
Htt = C Hyy. Division by c gives H.= - Uyy = 0, as stated before. Hence the characteristic equation is
y'2 - I = (y' + 1)(/ - 1) = O. The two families of solutions (characteristics) are <1>(x, y) = )' + x = const
and 'I'(x, y) = y - x = const. This gives the new variables v = <1> = y + x = ct + x and
w = 'I' = y - x = ct - x and d'Alembert's solution u = h(x + ct) + f2(X - ct).
•
CHAP. 12
552
-- =-
..
Partial Differential Equations (PDEs)
-
-...... .-. ......w................ ___ . _
1. Show that c is the speed of each of the two waves given
by (4).
2. Show that because of the boundaty conditions (2).
Sec. 12.3, the function f in (13) of this section must
be odd and of period 2L.
3. If a steel wire 2 m in length weighs 0.9 nt (about 0.20 lb)
and is stretched by a tensile force of 300 m (about 67.4
Ib). what is the corresponding speed of transverse waves?
4. What are the frequencies of the eigenfunctions in
Prob.3?
5. Longitudinal Vibrations of an Elastic Bar or Rod.
These vibrations in the direction of the x-axis are
modeled by the wave equation Utt = r 2 u xx , c 2 = EI P
(see Tolstov [C9]. p. 275). If the rod is fastened at one
end. x = 0, and free at the other. x = L. we have
11(0. t) = 0 and II.AL, t) = O. Show that the motion
corresponding to initial displacement u(x, 0) = f(x)
and initial velocity zero is
=
II =
2:
An sin Pnx cos Pnct,
n=O
2
L
An = -
12.5
fL f(x) sin
0
(211
PnX dx,
Pn =
+
2L
1)7T
/6-91
GRAPHING SOLUTIONS
using (13), st...etch or graph a figure (similar to Fig. 288 in
Sec. 12.3) of the deflection u(x. t) of a vibrating string
(length L = L ends fixed, c = I) starting with initial
velocity 0 and initial deflection (k small, say, k = 0.0 I).
6. f(x)
k sin
8.
kx(l - x)
.f(x)
7. f(x)
TTX
cos
= k(l -
27TX)
10. (Tricomi and Airy equations2 ) Show that the Tricomi
equatioll YU xx + lIyy = 0 is of mixed type. Obtain the
Airy equation G" - yG = 0 from the Tricomi equation
by separation. (For solutions, see p. 446 of Ref. [GR I]
listed in App. I.)
/11~01
NORMAL FORMS
Find the type, transform to normal form. and solve. (Show
the details of your work.)
11.
llxy -
13.
U xx
15.
U=
17.
u"", -
+
+
19. XlIxx -
Heat Equation: Solution
U yy
=
12. l"~x
0
9uyy
=0
2uxy
+
4uxy +
Xl/xy
=
Uyy
211xy
lI XY -
lI~.x
= 0
16.
XU"lI -
=
0 18. uxx
411yy
0
-
+
14.
20.
+
lixx -
+
yU yy =
2u xy
4uxy
U yy
2u yy
=
0
=0
0
+ 5uyy
+ 3l1 yy
= 0
=
0
by Fourier Series
From the wave equation we now turn to the next "'big" PDE, the heat equation
K
up
which gives the temperature u{x, y, ~, t) in a body of homogeneous material. Here c 2 is
the thermal diffusivity, K the thermal conductivity, u the specific heat, and p the density
of the material of the body. V2 11 is the Laplacian of u, and with respect to Cartesian
coordinates x, y, .:::,
The heat equation was derived in Sec. 10.8. It is also called the diffusion equation.
As an important application. let us first consider the temperature in a long thin metal
bar or wire of constant cross section and homogeneous material, which is oriented along
the x-axis (Fig. 291) and is perfectly insulated laterally. so that heat flows in the x-direction
2SIR GEORGE BIDELL AIRY (1801-1892), English mathematician, known for his work in elasticity.
FRANCESCO TRICOMI (1897-1978), Italian mathematIcian, who worked in integral equations.
SEC. 12.5
Heat Equation: Solution by Fourier Series
553
o
x=L
Fig. 291.
Bar under consideration
only. Then u depends only on x and time t. and the heat equation becomes the
one-dimensional heat equation
(1)
This seems to differ only very little from the wave equation, which has a term lttt instead
of LIt, but we shall see that this will make the solutions of (1) behave quite differently
from those of the wave equation.
We shall solve (1) for some important types of boundary and initial conditions. We
begin with the case in which the ends x = 0 and x = L of the bar are kept at temperature
zero, so that we have the boundary conditions
(2)
u(O, t)
= 0,
u{L, t)
=0
Furthermore, the initial temperature in the bar at time t
have the initial condition
(3)
u(x, 0)
=
for all t.
= 0 is given. say, f(x). so that we
f(.t)
[f(x) given].
Here we must have teO) = 0 and ttL) = 0 because of (2).
We shall determine a solution /I(x, t) of (I) satisfying (2) and (3)-one initial condition
will be enough, as opposed to two initial conditions for the wave equation. Technically.
our method will parallel that for the wave equation in Sec. 12.3: a separation of variables.
followed by the use of Fourier series. You may find a step-by-step comparison worthwhile.
Step 1. Two ODEs from the heat equation (1). Substitution of a product
u(x, t) = F{x)G(t) into (I) gives FG = c 2F"G with G = dG/dt and F" = d 2Fldr;2. To
separate the variables, we divide by c 2 FG, obtaining
(4)
G
F"
F
The left side depends only on t and the right side only on x, so that both sides must equal
a constant k (as in Sec. 12.3). You may show that for k = 0 or k > 0 the only solution
u = FG satisfying (2) is u == O. For negative k = _p2 we have from (4)
Multiplication by the denominators gives immediately the two ODEs
(5)
554
CHAP. 12
Partial Differential Equations (PDEs)
and
(6)
Step 2. Satisfying the boundary conditions (2). We first solve (5). A general solution is
Ex)
(7)
=
A cos px
+ B sin px.
From the boundary conditions (2) it follows that
lI(O. t)
=
F(O)G(t)
=0
u(L, t) = F(L)G(t) = O.
and
Since G == 0 would give LI == O. we require F(O) = O. F(L) = 0 and get F(O) = A = 0
by 0) and then F(L) = B sinpL = 0, with B =1= 0 (to avoid F == 0); thus,
sinpL
Setting B
=
= 0,
hence
II
= I. 2, ....
1, we thus obtain the following solutions of (5) satisfying (2):
117TX
Fn(x) = sin
L'
II =
1,2.....
(As in Sec. 12.3, we need not consider negative integral values of Il.)
All this was literally the same as in Sec. 12.3. From now on it differs since (6) differs
from (6) in Sec. 12.3. We now solve (6). For p = Il7TIL, as just obtained, (6) becomes
G + An 2G =
0
An =
where
CIl7T
L
It has the general solution
= I, 2, ...
11
where Bn is a constanl. Hence the funclions
(8)
L1 n(X, t)
=
Fn(x)Gn(t)
=
Bn sin
Il7TX
L
2
e- An
t
(n
= 1, 2, ... )
are solutions of the heat equation (1), satisfying (2). These are the eigenfunctions of the
problem. cOlTesponding to the eigenvalues An = cll7TIL.
Step 3. Solution of the entire problem. Fourier series. So far we have solutions (8)
satisfying the boundary conditions (2). To oblain a solution that also satisfies the initial
condition (3), we consider a series of these eigenfunctions,
ex;
(9)
u(x, t)
=
2:
n=1
un(x, t)
=
2:
ex;
117TX
Bn sin - - e-J. n 2t
n=1
L
( An
=
CII7T)
L
.
SEC 12.5
555
Heat Equation: Solution by Fourier Series
From this and (3) we have
n17X
x
L
=
u(x, 0)
Bn sin
L
=
j(x).
n=l
Hence for (9) to satisfy (3), the Bn' s must be the coefficients of the Fourier sine series,
as given by (4) in Sec. 11.3; thus
(10)
2
L
= -
Bn
fL
1117.\
I(x) sin - - dr
(11
L
0
= 1. 2... ').
The solution of our problem can be established, assuming that I(x) is piecewise
continuous (see Sec. 6.1) on the interval 0 ~ x ~ L and has one-sided derivatives (see
Sec. 11.1) at all interior points of that interval; that is, under these assumptions the series
(9) with coefficients (10) is the solution of our physical problem. A proof requires
knowledge of uniform convergence and will be given at a later occasion (Probs. 19, 20
in Problem Set 15.5).
Because of the exponential factor, all the terms in (9) approach zero as t approaches
infinity. The rate of decay increases with 11.
E X AMP L E 1
Sinusoidal Initial Temperature
Find the temperature 1I(.I. r} in a laterally insulated copper bar 80 cm long if the initial temperature i~
100 sin (Tlx/80) °C and the ends are kept at O°c. How long will it take for the maximum temperature in the
bar to drop to 50°C? First guess. then calculate. Physical data for copper: density 8.92 gm/cm3 • specific heat
0.092 call(gm °C), thermal conductivity 0.95 cal/(cm sec 0c}.
Solution.
The initial condition gives
u(x. O}
=
x
2::
1l7T\"
Bn sin
80 =
7TX
f(x}
=
100 sin
80 .
1'1-1
Hence. by inspection or from (9) we get Bl = 100. B2 = B3 = ... = O. In (9) we need A12 =
where c 2 = KI(ap) = 0.95/(0.092' 8.92) = 1.158 [cm2/sec]. Hence we obtain
The solution
(9)
2
C T121L2,
is
U(l
.,
TlX
t) = 100 sin e-O.00l785t
80
.
Also, 100e-O.001785t = 50 when r = (\nO.5)/(-0.00l785) = 388 rsec] = 6.5 [min1. Does your guess. or at
least its order of magnitude. agree with this result'!
•
E X AMP L E 1
Speed of Decay
Solve the problem in Example 1 when the initial temperature is 100 sin (3T11/80) °C and the other data are as
before.
Solutioll. In (9). instead of 11 = I we now have 11 = 3. and A32 = 32A12 = 9 • 0.001 785 = 0.01607, so that
the solution now is
3T1x
001607t
!l(x r) = 100 sin - - e- .
,
80
.
Hence the maximum temperature drops to SO°C in t = (in 0.5)/( -0.01607) = 43 [secondsl, which is much
faster (9 times as fast as in Example I; why?).
556
CHAP. 12
Partial Differential Equations (PDEs)
Had we chosen a bigger 11. the decay would have been still faster. and in a sum or series of such terms, each
term has it~ own rate of decay. and terms with large 11 are practically 0 after a very short time. Our next example
is of this type. and the curve in Fig. 292 corresponding to r = 0.5 looks almost like a sine curve; that is, it is
practically the graph of the first term of the solution.
•
~=O
"I
L-\
n
x
x
u~
x
u
I_--_-:t = 2
~
=:-:--"1
x
Fig. 292. Example 3. Decrease of temperature
with time t for L = 'if and c = 1
E X AMP L E 3
"Triangular" Initial Temperature in a Bar
Find the temperalllre in a laterally insulated bar of length L whose ends are kept at temperature 0, assuming that
the initial temperature is
if
f(x) = {
0<.l<Ll2,
x
L-x
if
LI2 < x < L.
(The uppermost part of Fig. 292 shows this function for the special L = 17.)
Solutioll.
From (10) we get
2 (JLl2x sin 1117<
L dx +
Bn = L
o
r
1l17X) .
(L - x) sin - - dx
L
L/2
Integration gives Bn = 0 if II is even.
4L
Bn =
2
11 7i
2
(n = I, 5, 9, ... )
and
Bn = -
4L
f1
2
'iT
2
(II =
3,7, 11, .. ').
(see also Example 4 in Sec. 11.3 with k = Ll2). Hence the solution is
!I(x, t) =
Cl7)2 t ] -"9I ~in L317X
4L [ sin L
17" exp [- ( L
17
2
exp
[-
Figure 292 shows that the temperalllre decreases with increasing t. because of the heat loss due to the cooling
of the ends.
Compare Fig. 292 and Fig. 288 in Sec. 12.3 and comment.
•
SEC. 12.5
557
Heat Equation: Solution by Fourier Series
E X AMP L E 4
Bar with Insulated Ends. Eigenvalue 0
Find a solution formula of (I). (3) with (2) replaced by the condition that both ends of the bar are insulated.
Solution.
Physical experiments show that the rate of heat flow is proportional to the gradient of the
temperature. Hence if the ends x = 0 and x = L of the bar are in~ulated. so that no heat can t10w through the
ends. we have grad u = lIx = iJuliJx and the boundary conditions
C2"";
" x CO,1) =
O.
"xCL.
Since II(X. t) = F{x)G{t). this gives u,,{O, t) = F'(O)G(t)
we have F' (x) = - Ap sin px + Bp cos px. '0 that
FiCo)
= Bp = 0
=
f)
for all r.
= 0
0 and "x(L, t)
and then
F'CL)
=
=
F'{L)G(t)
-Ap sinpL
=
O. Differentiating 0).
= O.
The second of these conditions gives p = Pn = Il'TTIL. (11 = O. l. 2, .. '). From this and (7) with A = I
and B = 0 we get Fn{x) = cos (Il'TTxIL). (II = 0, l. 2... '). With G n as before, thi~ yield~ the eigenfunctions
(II)
UniX. t)
117iX
-A 21
Len
= Fnl,)GnltJ = An cos
(n =
0, I,"'J
corresponding to the eigenvalues An = cn7TlL. The latter are as before, but we now have the additional eigenvalue
Ao = 0 and eigenfunction 110 = consr, which is the solution of the problem if the initial temperature fIx) is
constant. This ,hows the remarkable fact that a separatioll cOllstallt call very well be zero. alld ;;,ero call be all
eigellvalue.
Furthermore. whereas (8) gave a Fourier sine series. we now get from (II) a Fourier cosine series
x
(ll)
II(X. t)
=
x
L
=
ll.ix. 1)
Il7iX
L
n=O
11.=0
An co~ - - e -A"
L
2
t
(
An
=
C/l7T)
L'
Its coefficients result from the initial condition (3),
x
tI{x. 0)
=
f17TJ..-
L
An cos
L
= fIx),
n-O
in the form (2). Sec. 11.3. that is.
(3)
E X AMP L E 5
±f
Ao =
L
o
An =
fIx) dx.
2
L
f
L
o
I17TX
fix) cos - - dx.
L
11
•
= 1.2."
"Triangular" Initial Temperature in a Bar with Insulated Ends
Find the temperature in the bar in Example 3, assuming that the ends are insulated (instead of being kept at
temperature 0).
Solutioll.
For the triangular initial temperature. (13) gives Ao = Ll4 and (see also Example 4 in Sec. 11.3
with k = U2)
An
2
= -
L
[
f
L/2
X
11 'iTt
cos - - dx
0
L
+
f
L
]
Il1TX
2L
(L - t) cos - - dx
L/2
(
L
2 cos
117T
""2
-
COSIl7r-
I) .
Hence the solution {I 2) is
L
8L { I
2nt
[
- cos - - exp 4
~
22
L
u{x r) = -
,
( L2e7T)2 I ]
+
I
62 cos
L67TX
exp
[- ( L6C7T)2 r]
+...} .
We see that the terms decrease with increasing t. and II - 4 Ll4 as t -> "'; this is the mean value of the initial
temperature. Thi~ is plausible because no heat can escape from this totally insulated bar. In contrast. the cooling
of the ends in Example 3 led to heat loss and u -> O. the temperature at which the ends were kept.
•
558
CHAP. 12
Partial Differential Equations (PDEs)
Steady Two-Dimensional Heat Problems.
Laplace's Equation
We shall now extend our discussion from one to two space dimensions and consider the
two-dimensional heat equation
for steady (that is. time-indepelldem) problems. Then all/at = 0 and the heat equation
reduces to Laplace's equation
(14)
(which has already occUlTed in Sec. 10.8 and will be considered fUlther in
Secs. 12.7-12.10). A heat problem then consists of this POE to be considered in some
region R of the xy-plane and a given boundary condition on the boundary curve C of R
This is a boundary value problem (BVP). One calls it:
First BVP or Dirichlet Problem if u is prescribed on C ("Dirichlet boundary
condition")
Second BVP or ~eumann Problem if the normal derivative u.,
prescribed on C ("Neumann boundary condition")
=
aU/all is
Third BVP, Mixed BVP, or Robin Problem if 1I is prescribed on a portion of C
and u" on the rest of C ("Mixed buundary condition").
y
u = {(x)
b~----------~--------~
u=o
u=o
R
o-r----------u-=-o----------~--------x
o
Fig. 293.
a
Rectangle R and given boundary values
Dirichlet Problem in a Rectangle R (Fig. 293). We consider a Dirichlet problem for
Laplace's equation (14) in a rectangle R. assuming that the temperature u(x, y) equals a
given function f(x) on the upper side and 0 on the other three sides of the rectangle.
We solve this problem by separating variables. Substituting u(x, y) = F(x)G(y) into
(14) written as Uxx = -Uyy , dividing by FG, and equating both sides to a negative constant,
we obtain
SEC. 12.5
Heat Equation: Solution by Fourier Series
559
-k.
From this we get
+ kF =
d'(2
0,
and the left and right boundary conditions imply
F(O) = 0,
This gives k
=
F(a) = O.
and
(1l7T/a)2 and corresponding nonzero solutions
F(x)
(15)
The ODE for G with k
=
=
FnlX)
= sin
1l7T
-x,
a
11
= L 2,···.
(117T/a)2 then becomes
2
d .G _ (1l7T)2 G
d\"2
a
= O.
Solutions are
Now the boundary condition II = 0 on the lower side of R implies that Gn(O) = 0; that
is, Gn(O) = An + En = 0 or En = -An. This gives
From this and (15), writing 2An
(16)
un(x, y)
=
= A~,
we obtain as the eigenfunctions of our problem
Fn(x)GnCv)
x
.
Il7TX.
Il7TY
= A':' sm - - smh - - .
a
a
These solutions satisfy the boundary condition u = 0 on the left, right. and lower sides.
To get a solution also satisfying the boundary condition u(x, b) = f(x) on the upper
side, we consider the infinite series
co
u(x, y) = ~ un(x, y).
71=1
From this and (16) with y = b we obtain
DC
*
Il7TX
117Tb
a
a
u(x. b) = f(x) = ~ An sin - - sinh - - .
We can write this in the form
U(x,
b) = ~1 (A!
sinh
n:b)
.
Il7TX
sm--.
a
560
CHAP. 12
Partial Differential Equations (PDEs)
This shows that the expressions in the parentheses must be the Fourier coefficients bn of
f(x); that is, by (4) in Sec. 11.3,
bn
*
fa j(x) sin - - dx.
2
a
n7Tb
Il7TX
= An sinh - - = a
a
0
From this and (16) we see that the solution of our problem is
"
"* sin -a- sinh -a- '
""
u(x, y)
(17)
x
= .c.. un(x, y) = .c.. An
n7TX
117TY
where
A~
(18)
=
2
a sinh (l17Tbla)
fa f(x) sin - - dx.
n7TX
0
a
We have obtained this solution formally, neither considering convergence nor showing
that the series for u, U:rx' and U yy have the right sums. This can be proved if one assumes
that f and f' are continuous and f" is piecewise continuous on the interval 0 ~ x ~ a.
The proof is somewhat involved and relies on uniform convergence. It can be found in
[C4] listed in App. l.
Unifying Power of Methods. Electrostatics, Elasticity
The Laplace equation (14) also governs the electrostatic potential of electrical charges in
any region that is free of these charges. Thus our steady-state heal problem can also be
interpreted as an electrostatic potential problem. Then (17), (18) is the potential in the
rectangle R when the upper side of R is at potential f(x) and the other three sides are
grounded.
Actually, in the steady-state case, the two-dimensional wave equation (to be considered
in Secs. 12.7, 12.8) also reduces to (14). Then (17), (18) is the displacement of a rectangular
elastic membrane (rubber sheet, drumhead) that is fixed along its boundary, with three
sides lying in the x),-plane and the fourth side given the displacement f(x).
This is another impressive demonstration of the unifying power of mathematics. It
illustrates that entirely different physical systems may have the same mathematical model
and can thus be treated by the same mathematical methods .
..,_
... •
~-
1. WRITING PROJECT. Wave and Heat Equations.
Compare the two PDEs with respect to type, general
behavior of eigenfunctions. and kind of boundary and
initial conditions and resulting practical problems. Also
discuss the difference between Figs. 288 in Sec. 12.3
and 292.
2. (Eigenfunctions) Sketch (or graph) and compare the
first three eigenfunctions (8) with Bn = 1, c = I.
L = 7T for t = 0, 0.2, 0.4, 0.6, 0.8. 1.0.
3. (Decay) How does the rate of decay of (8) with fIxed
/J depend on the specific heat, the density, and the
thermal conductivity of the material?
SEC. 12.5
Heat Equation: Solution
561
by Fourier Series
4. If the first eigenfunction (8) of the bar decreases to half
its value within 10 sec, what is the value of the
diffusi vity?
15-91
LATERALLY INSULATED BAR
A laterally insulated bar of length 10 cm and constant
cross-sectional area I cm2 , of density 10.6 gmlcm3 , thermal
conductivity 1.04 call( cm sec DC), and specific heat
0.056 cal/(gm DC) (this corresponds to silver, a good heat
conductor) has initial temperature f(x) and is kept at ODC
at the ends x = 0 and x = 10. Find the temperature u(x, t)
at later times. Here, f(x) equals:
5. f(x)
sin 0.411X
6. f(x)
sin 0.11lx + i sin 0.2m:
7. f(x)
0.2x if 0 < x < 5 and 0 otherwise
8. f(x}
I - 0.21x - 51
9. f(x) = x if 0 < x < 2.5, f(x) = 2.5 if2.5 < x < 7.5,
f(x)
=
10 - x if 7.5 < x < 10
(Arbitrar~ temperatures at ends) If the ends x = 0
and x = L of the bar in the text are kept at constant
temperatures VI and V 2 , respectively, what is the
temperature UI(X) in the bar after a long time
(theoretically, as t -'? "YO)? First guess, then calculate.
11. Tn Prob. 10 find the temperature at any time.
12. (Changing end temperatures) Assume that the ends
of the bar in Probs. 5-9 have been kept at 100DC for a
long time. Then at some instant. call it t = 0, the
temperature at x = L is suddenly changed to ODC and
kept at ODC, whereas the temperature at x = 0 is kept
at 100De. Find the temperature in the middle of the bar
at t = L 2. 3, 10, 50 sec. First guess, then calculate.
10.
BAR UNDER ADIABATIC CONDITIONS
"Adiabatic" means no heat exchange with the
neighborhood. because the bar is completely insulated, also
at the ends. PhysicolTnformation: The heat flux at the ends
is proportional to the value of aulax there.
13. Show that for the completely insulated bar.
ux(O, t) = 0, uAL, t) = 0, /leX, t) = f(x) and separation
of variables gives the following solution, with An given
by (2) in Sec. 11.3.
u(x. t) = Ao
114-19J
C
=
14.
16.
18.
"" An cos ";x e-{cn7T/L)2t
+ L
21. The boundary condition of heat transfer
(19)
-u x (1I, t) = k[II('IT. t) -
uo]
applies when a bar of length 'IT with c = I is laterally
insulated, the left end x = 0 is kept at O°C, and at the
right end heat is flowing into air of constant
temperature uo. Let k = I for simplicity, and 110 = O.
Show that a solution is lI(x, t) = sin px e- p2t • where
P is a solution of tan p1I = - p. Show graphically
that this equation has infinitely many positive solutions
PI> P2, P3, ... , where Pn > 11 - i and
lim (Pn -
n_cc
11
+ ~)
=
O. (Formula (19) is also known
as radiation boundary condition, but this is
misleading; see Ref. [C3], p. 19.)
22. (Discontinuous f) Solve (\), (2), (3) with L = 11
and f(x) = Vo = const (=1= 0) if 0 < x < 11/2,
f(x) = 0 if 1112 < x < 11.
23. (Heat flux) The heat fiLix of a solution II(X, t) across
x = 0 is defined by cp(t) = - KlIx(O, t). Find CPU) for
the solution (9). Explain the name. Is it physically
understandable that cp goes to 0 as t -'? x?
OTHER HEAT EQUATIONS
24. (Bar with heat generation) If heat is generated at a
constant rate throughout a bar of length L = 'IT with
initial temperature f(x) and the ends at x = 0 and
'IT are kept at temperature 0, the heat equation is
2
U t = c u xx + H with constant H > O. Solve this
problem. Hint. Set u = v - Hx(x - 'IT)/(2c 2 ).
25. (Convection) If heat in the bar in the text is free to
flow through an end into the surrounding medium
kept at O°C, thePDEbecomesv t = c 2 v xx - f3v. Show
that it can be reduced to the form (I) by setting
vex, t) = II(X, t)w(t).
26. Consider v t = c 2 v xx - v (0 < x < L, t > 0),
v(O, t) = 0, veL, t) = 0, vex, 0) = f(x), where the term
-v models heat transfer to the surrounding medium
kept at temperature O. Reduce this PDE by setting
vex, t) = u(x, t)w(t) with w such that U is given by (9),
(10).
27. (Nonhomogeneous heat equation) Show that the
problem modeled by
n~l
Find the temperature in Prob.
1, and
f(x) = t"
15. f(x)
17. f(x)
f(x) = 0.5 cos 4x
19. f(x)
f(x) = ~11 - Ix - ~1I1
13 with L
=
1
=
112 - x 2
= (x -
11.
~11)2
20. Find the temperature of the bar in Prob. 13 if the left
end is kept at ODC, the right end is insulated, and the
initial temperature is Vo = const.
and (2), (3) can be reduced to a problem for the
homogeneous heat equation by setting
u(x, t) = vex, t)
+
w(x)
and determining w so that v satisfies the homogeneous
PDE and the conditions v(O, t) = veL, t) = 0,
v(x.O) = f(x) - w(x). (The term Ne- ax may represent
heat loss due to radioactive decay in the bar.)
CHAP. 12
562
28-351
Partial Differential Equations (PDEs)
TWO-DIMENSIONAL PROBLEMS
28. (Laplace equation) Find the potential in the rectangle
o ~ x ~ 20, 0 ~ y ~ 40 whose upper side is kept at
potential 220 V and whose other sides are grounded.
29. Find the potential in the square 0 ~ x ~ 2, 0 ~ y ~ 2
if the upper side is kept at the potential sin 47TX and the
other sides are grounded.
30. CAS PROJECT. Isotherms. Find the steady-state
solutions (temperatures) in the square plate in Fig. 294
with a = 2 satisfying the following boundary
conditions. Graph isotherms.
(a) II = sin TTX on the upper side. 0 on the others.
(b) u = 0 on the vertical sides. assuming that the other
sides are perfectly insulated.
(c) Boundary conditions of your choice (such that the
solution is not identically zero).
YI
a~
-,
Fig. 294.
12.6
a
x
Square plate
31. (Heat flow in a plate) The faces of the thin square
plate in Fig. 294 with side a = 24 are perfectly
insulated. The upper side is kept at 20°C and tl1e other
sides are kept at O°C. Find the steady-state temperature
II(X, y) in the plate.
32. Find the steady-state temperature in the plate in Prob.
31 if the lower side is kept at UOdc, the upper side at
U I dc, and the other sides are kept at O°c. Hint: Split
into two problems in which the boundary temperature
is 0 on three sides for each problem.
33. <Mixed boundary value problem) Find the steady~tate temperature in the plate in Prob. 31 with the upper
and lower sides perfectly insulated. the left side kept
at O°c. and the right side kept at f(y)°C.
34. (Radiation) Find steady-state temperatures in the
rectangle in Fig. 293 with the upper and left sides
perfectly insulated and the right side radiating into a
medium at O°C according to 1I,.(a. y) + hu(a, y) = O.
" > 0 constant. (You will get many solutions since no
condition on the lower side is given.)
35. Find formulas similar to (\7). (18) for the temperature
in the rectangle R of the text when the lower side of R
is kept at temperature f(x) and the other side~ are kept
at O°c.
Heat Equation: Solution by
Fourier Integrals and Transforms
Our discussion of the heat equation
(1)
in the last section extends to bars of infinite length, which are good models of very long
bars or wires (such a<; a wire of length, say, 300 ft). Then the role of Fourier series in the
solution process will be taken by Fourier integrals (Sec. 11.7).
Let us illustrate the method by solving (1) for a bar that extends to infinity on both
sides (and is laterally insulated as before). Then we do not have boundary conditions. but
only the initial condition
(2)
u(x, 0)
=
f(x)
(-co
where f(x) is the given initial temperature of the bar.
To solve this problem, we start as in the last section, substituting uex, t)
into (1). This gives the two ODEs
(3)
< x<
ro)
F(x)G(t)
[see (5), Sec. 12.5]
SEC. 12.6
563
Heat Equation: Solution by Fourier Integrals and Transforms
and
[see (6), Sec. 12.5].
(4)
Solutions are
= A cos px + B sin px
F(x)
and
respectively. where A and 8 are any constants. Hence a solution of (1) is
u(x, t; p) = FG = (A cuspx
(5)
+
8 sinpx)e-C2p2t.
Here we had to choose the separation constant k negative. k = _p2, because positive
values of k would lead to an increasing exponential function in (5), which has no physical
meaning.
Use of Fourier Integrals
Any series of functions (5), found in the usual manner by taking p as multiples of a fixed
number, would lead to a function that is periodic in x when t = O. However, since f(x)
in (2) is not assumed to be periodic, it is natural to use Fourier integrals instead of Fourier
series. Also, A and B in (5) are arbitrary and we may regard them as functions of p, writing
A = A(p) and 8 = 8(p). Now, since the heat equation (I) is linear and homogeneous.
the function
(6)
u(x, t)
= {:U(X, t;
p) dp
=
{c[A(P)
o
COSpl
+ 8(p)
sinpx]e-C2p2t dp
0
is then a solution of (I), provided this integral exists and can be differentiated twice with
respect to x and once with respect to t.
Determination ofA(p) and R(p) from the Initial Condition.
(7)
fleX, 0)
=
From (6) and (2) we get
LX
[A(p) cos px + 8(p) sin px] dp = f(x).
o
This gives A(p) and 8(p) in teons of f(x); indeed. from (4) in Sec. 11.7 we have
1
(8)
A(p)
= 7T
According to
written
(l *).
f
1
:x::
feu) cos pu du.
8(p)
= 7T
-00
f
:x::
feu) sinpu du.
-GO
Sec. 11.9. our Fourier integral (7) with these A(p) and B(p) can be
u(x, 0) =
I
7T
~X[:X::
icof(U) cos (px
- pu) du
]
dp.
Similarly, (6) in this section becomes
u(x, t)
= -1
7T
iX[J~
0
-::x;
feu) cos (px - pu) e-c2p2t du
] dp.
564
CHAP. 12
Partial Differential Equations (PDEs)
Assuming that we may reverse the order of integration. we obtain
(9)
u(x, t)
= -1
7T
IW [W
L
f(v)
cos (px - pv) dp
e-c2p2t
] dv.
0
-ex.
Then we can evaluate the inner integral by using the formula
oo
L
-"
e-~-
o
(10)
v;.
b2
cos 2bs ds = -2- e- .
[A derivation of (10) is given in Problem Set 16.4 (Team Project 28).] This takes the form
of our inner integral if we choose p = s/(eVt) as a new variable of integration and set
x-v
b=-2eVt .
Then 2bs = (x - v)p and ds = eVtdp, so that (10) becomes
L
x
o
cos (px - pv) dp =
e-c2p2t
V;
~
r
2evt
{ ( X - V)2 }
4e 2 t '
exp -
By inserting this result into (9) we obtain the representation
(11)
u(x, t)
=
I
~ r-:
I
2ev 7T1
oo
-00
f(v) exp {-( X - 2V)2 } dv.
4c t
Taking.:: = (v - x)f(2eVt) as a variable of integration, we get the alternative form
1
(12)
/lex. t)
= ~~
v
7T
I
oc
f(x
+
2ezVt) e-
z2
dz.
-x
If f(x) is bounded for all values of x and integrable in every finite interval, it can be
shown (see Ref. [ClOD that the function (11) or (12) satisfies (I) and (2) Hence this
function is the required solution in the present case.
E X AMP L E 1
Temperature in an Infinite Bar
Find the temperature in the infinite bar if the initial temperature is (Fig. 295)
I(x) =
Vo = cons'
if
~TI
o
if
Ixl> I.
{
< I,
{(xli
~
I I I
-1
Fig. 295.
x
Initial temperature in Example 1
SEC. 12.6
Heat Equation: Solution by Fourier Integrals and Transforms
Solution.
565
From (11) we have
u(x. t)
U,
•~
=
Ii {
exp -
-1
2CV7Tt
(x - v)2 }
--24c I
dv.
If we introduce the above variable of integration ::. then the integration over v from - I to 1 corre~ponds to the
integration over:: from (-I - x)/(2cVi) to (l - x)/(2cVi). and
(13)
!I(X,
I
y:;;:
(1
-xl/(2cVt)
Vot) = -
e- Z2 dz
(t> 0).
-0 +xl/(2cVtl
We mention that this integral is not an elementary function, but can be expressed in terms of the error function,
whose values have been tabulated. (Table A4 in App. 5 contains a few values; larger tables are listed in
Ref. [GRI] in App. I. See also CAS Project 10. p. 568.) Figure 296 shows I/(X. t) for Vo = 100°C,
c 2 = 1 cm2 /sec. and several values of t.
•
ulx,
-3
-2
-1
t)
o
Fig. 296. Solution u(x, t) in Example 1 for Uo = lOO·C,
c 2 = 1 cm 2 /sec, and several values of t
Use of Fourier Transforms
The Fourier transform is closely related to the Fourier integral, from which we obtained
the transform in Sec. 11.9. And the transition to the Fourier cosine and sine transform in
Sec. 11.8 was even simpler. (You may perhaps wish to review this before going on.)
Hence it should not surprise you that we can use these transforms for solving our present
or similar problems. The Fourier transform applies to problems concerning the entire axis.
and the Fourier cosine and sine transforms to problems involving the positive half-axis.
Let us explain these transform methods by typical applications that fit our present
discussion.
E X AMP L E 2
Temperature in the Infinite Bar in Example 1
Solve Example I using the Fourier transfonn.
Solutioll.
The problem consist, of the hear equation (I) and the initial condition (2), which in this example is
f(x) = Vo = COllst
if Ixl < I
and 0 otherwise.
Our strategy is Lo take the Fourier transform with respect to x and then to solve the resulting ordinary DE in t.
The details are as follows.
566
CHAP. 12
Partial Differential Equations (PDEs)
LeI i; = g;'(1I) denote the Fourier transform of II. regarded as afllllcliOlI ofx. From (l0) in Sec. 11.9 we see
that the heat equation (1) gives
On the left, assuming that we may interchange the order of differentiation and integmtion. we have
~(Ut) =
1
.~
V217
f""·
Ute-tWX dx =
-00
fcc.
au
lie-twX dx = -:;- .
I
=-a
at
V217
-00
at
Thus
au
at =
-c 2 w 2"u.
Since this equation involves only a derivative with respect to t but none with respect to w, this is a tirst-order
ordinary DE. with t as the independent variable and was a parameter. By separating variables (Sec. 1.3) we
get the general solution
2 2
u(w, t) = C(w)e -C w
t
with the arbitrary "constant" C(w) depending on the parameter w. The initial condition (2) yields the relationship
u(w, 0) ~ C(w) ~ J(w) = ~(f). Our intermediate result is
The inversion fonnula (7). Sec. 11.9. now gives the solution
(14)
In this SolUTion we may insert the Fourier transform
"
few) ~
I
fCC
ivw
vz;
_ccf(u)edu.
Assuming that we may inver! the order of integration, we then obtain
By the Euler formula (3). Sec. l1.9. the integrand of the inner integral equals
We see that it~ imaginary part is an odd function of w, so that its integral is O. (More precisely, this is the
principal part of the integral; see Sec. 16.4.) The real part is an even function of w, so that its integral from
-:x; to :x; equals twice the integral from 0 to x:
This agrees with (9) (withp = w) and leads to the further formulas (il) and (13).
E X AMP L E 3
Solution in Example 1 by the Method of Convolution
Solve the heat problem in Example I by the method of convolution.
Solution.
The beginning is as in Example 2 and leads to (14). that is,
•
SEC 12.6
567
Heat Equation: Solution by Fourier Integrals and Transforms
Now comes the crucial idea. We recognize that this is of the form (13) in Sec. 11.9. that
(16)
uCr. t)
= (j '" g)Cr) =
IX
i~.
I(w)g(w)eiwx dw
-00
where
(17)
Since. by the definition of convolution [(II l. Sec. 11.9],
(f
(18)
* g)(x)
=
{YO f(p)g(x
- p) dl',
-:lc
as our next and last step we must determine the inverse Fourier tmnsform g of g. For this we can use formula
9 in Table III of Sec. 11.10,
with a suitable a With c 2 t = 1I(4a) or a = 1I(4c 2 t). using (17) we obtain
Hence g has the inverse
e
Replacing x with x - I' and
(19)
suh~tituting
u(x, t) = (f
-XZ/(4c2 t)
this into (18) we finally have
I
* g)(x)
=
•
~
2c V7Tl
foo f(p) exp {-( X
- p)2 }
- - 2 - dp.
4c
-:>G
This solution formula of our problem agrees with (II l. We wrote (f
with respect to which we did not integrate.
E X AMP L E 4
.
I
* g)(x). without indicating the parameter I
•
Fourier Sine Transform Applied to the Heat Equation
If a latemlly insulated bar extends from" = 0 to infinity, we can use the Fourier sine transform. We let the
initial temperature be u(x, 0) = f(x) and impose the boundary condition 11(0, t) = O. Then from the heat equation
and (9b) in Sec. 11.8, since frO) = £1(0, 0) = 0, we obtain
This is a first-order ODE aUs/ill +
u
2 2
C 1l. s =
O. Its solution is
u
From the initial condition u(x. OJ = f(x) we have s ("', 0) = IS<w) = C(w). Hence
"s(w, t)
= is(w)e -dlw
2
t.
Taking the inverse Fourier sine transform and ,ubstituting
Is(w)
=
IT
~-;
{"'"f(P) sin lI'p dp
0
CHAP. 12
568
Partial Differential Equations (PDEs)
on the right. we obtain the solution formula
u(x. t) ~ -2
(20)
7T
L=L""
f(p) sin 1\'1' e - c
0
2
2
w t
sin lH dp dll".
0
Figure 297 shows (20) with c = I for f(x) = I if 0 ~ x ~ 1 and 0 otherwise. graphed over the xt-plane for
0;;;; x;;;; 2. 0.01 ;;;; t ~ 1.5. Note that the curves of u(x, t) for constant t resembk- those in Fig. 296 on p. 565 . •
Fig. 297.
11-71
SOLUTION IN INTEGRAL FORM
Using (6). obtain the solution of (I) in integral form
satisfying the initial condition u(x. m = lex). where
lex)
l if Ixl < a and 0 otherwise
2. f(x) = e- klxl (/.. > m
1.
3. f(x)
LI( I
+
X2).
[Use (15) in Sec. 11.7.]
4. I(x)
=
(sin xl/x. [Use Prob. 4 in Sec. 11.7.]
5. I(x)
=
(sin 7fx)/x. [Use Prob. 4 in Sec. 11.7.]
6. f(x) = x if Ixl < I and 0 otherwise
7. I(x) = Ixl if Ixl < 1 and 0 otherwise.
8. Verify that
II
in Prob. 5 satisfies the initial condition.
9. CAS PROJECT. Heat Flow. (a) Graph the basic
Fig. 296.
(b) In (a) apply animation to "see" the heat flow in
terms of the dccrease of temperature.
(c) Graph u(x, t) with c
xt-half-plane.
=
I as a surface over the upper
2
erfx = - ~
rX
J
0
e-
w2
This function is imp0l1ant in applied mathematics
and physics (probability theory and statistics.
thermodynamics. etc.) and fits our present discussion.
Regarding it as a typical case of a special function
defined by an integral that cannot be evaluated as in
elementary calculus. do the following.
(a) Sketch or gmph the bell-shaped curve [the curve
of the integrand in (21 )J. Show that erf x is odd. Show
that
I
b
e-
w2
dw =
2~
(erfb - erfa),
a
I
b
e-
w2
dw =
.y:;;: erf b.
-b
(b) Obtain the Maclaurin series of erf x from that
of the integrand. Use that series to compute a table of
erfx for x = OCO.OI)3 (meaning x = O. O.OL 0.02,
.... 3).
(c) Obtain the values required in (b) by an integration
command of your CAS. Compare accuracy.
to. CAS PROJECT. Error Function
(21)
Solution (20) in Example 4
dw
(d) [t can be shown that erf (x) = 1. Confirm this
experimentally by computing erf x for large x.
SEC. 12.7
569
Modeling: Membrane, Two-Dimensional Wave Equation
(e) Let J(x) = 1 when x> 0 and 0 when x < O. Using
erf(X) = 1, show that (12) then gives
1
u(x, t) =
• I
v 7f
I
x
(g) Show that
¢(t) =
1
=-"27f
IX e- s /2 ds
2
-ex:
2
e-
Z
dz
-x/(2cVtJ
(t> 0).
(1) Express the temperature (13) in tenns of the error
Here. the integral is the definition of the "distribution
function of the normal distribution" to be discussed in
function.
Sec. 24.8.
12.7
Modeling: Membrane,
Two-Dimensional Wave Equation
The vibrating string in Sec. 12.2 is a basic one-dimensional vibrational problem. Equally
important is its two-dimensional analog, namely, the motion of an elastic membrane. such
as a drumhead, that is stretched and then fixed along its edge. Indeed. setting up the model
will proceed almost as in Sec. 12.2.
Physical Assumptions
1. The mass of the membrane per unit area is constant ("homogeneous membrane").
The membrane is perfectly flexible and offers no resistance to bending.
2. The membrane is stretched and then fixed along its entire boundary in the xy-plane.
The tension per unit length T caused by stretching the membrane is the same at all
points and in all directions and does not change during the motion.
3. The deflection u{x. y. t) of the membrane during the motion is small compared to
the size of the membrane, and all angles of inclination are small.
Although these assumptions cannot be realized exactly, they hold relatively accurately for
small transverse vibrations of a thin elastic membrane, so that we shall obtain a good
model, for instance. of a drumhead.
Derivation of the POE of the Model ("Two-Dimensional Wave Equation") from
Forces. As in Sec. 12.2 the model will consist of a POE and additional conditions. The
POE will be obtained by the same method as in Sec. 12.2, namely, by considering the
forces acting on a small portion of the physical system, the membrane in Fig. 298 on the
next page, as it is moving up and down.
Since the deflections of the membrane and the angles of inclination are small. the sides
of the portion are approximately equal to Ax and Ay. The tension T is the force per unit
length. Hence the forces acting on the sides of the portion are approximately T Ax and
T Ay. Since the membrane is perfectly flexible, these forces are tangent to the moving
membrane at every instant.
Horizontal Components of the Forces. We first consider the horizontal components
of the forces. These components are obtained by multiplying the forces by the cosines of
the angles of inclination. Since these angles are small, their cosines are close to I. Hence
570
CHAP. 12
Partial Differential Equations (PDEs)
Membrane)
x
u
~
Z·;
___
.P".
4p
Tl'.y
TI'.X
~~T!'.y
- jl--""\-.-
a
Tl'.y
x+~x
1
{3
:
1
1 1
1
1
1 1
1
1 T!'>.x 1 1
1
1
1 1
1
1----.,,1
1
1
:
.,,/ -,. ____ 1
-----.:k.......-
I
1
---J..... "
/
. . . . :J
r
x+~x
Fig. 298. Vibrating membrane
the horizontal components of the forces at opposite sides are approximately equaL
Therefore, the motion of the particles of the membrane in a horizontal direction will be
negligibly small. From this we conclude that we may regard the motion of the membrane
as transversal; that is, each particle moves vertically.
Vertical Components of the Forces.
left side are (Fig. 298), respectively,
T ~y sin
f3
These components along the righr side and the
-T ~v sin a.
and
Here a and f3 are the values of the angle of inclination (which varies slightly along the
edges) in the middle of the edges, and the minus sign appears because the force on the
left side is directed downward. Since the angles are small, we may replace their sines by
their tangents. Hence the resultant of those two vertical components is
T ily (sin
(I)
f3 - sin a) = T .ly (tan f3 - tan a)
=
where subscripts x denote partial
T~)' [ux(x
derivative~
+
~X, )'1) -
ux(X, )'2)]
and Yl and .\'2 are values between), and
)' + .ly. Similarly. the resultant of the vertical components of the forces acting on the
other two sides of the portion is
(2)
where
Xl
and
X2
are values between
X
and x
+
.lx.
Newton's Second Law Gives the POE of the Model. By Newton's second law (see
Sec. 2.4) the sum of the forces given by (I) and (2) is equal to the mass p~A of that small
SEC. 12.8
Rectangular Membrane. Double Fourier Series
571
portion times the acceleration a2 u/at 2 ; here p is the mass of the undeflected membrane
per unit area, and .lA = Llx Lly is the area of that portion when it is undeflected. Thus
a2 u
p Llx~)' at 2
+
=
T Lly [ux(x
+
T ~.\ [lIiXI' )'
.lX• .'"1) - lIx(X, )'2)]
+ .1y)
- Uy (X2, y)]
where the derivative on the left is evaluated at some suitable point (x, Y) corresponding
to that portion. Division by p .lx LlY gives
2
a u
at 2
=
+ .lx. )'1) - II"J', }'2)
p.lx
T [lI x (X
+
lIy(x., y
+ ~)') -
lIiX2. Y) ] .
.ly
If we let Llx and Lly approach zero, we obtain the PDE of the model
(3)
p
This PDE is called the two-dimensional wave equation. The expression in parentheses
is the Laplacian V2 u of u (Sec. 10.8). Hence (3) can be written
(3')
Solutions of the wave equation (3) will be obtained and discussed in the next section.
12.8
Rectangular Membrane.
Double Fourier Series
The model of the vibrating membrane for obtaining the displacement u(x, y, t) of a point
(x, y) of the membrane from rest (u = 0) at time tis
(1)
(2)
(3a)
(3b)
u
= 0 on the boundary
u(x, y. 0)
=
f(x. y)
y. 0)
=
g(x. v).
lit (x,
Here (1) is the two-dimensional wave equation with c 2 = TIp just derived, (2) is the
boundary condition (membrane fixed along the boundary in the xy-plane for all times
t ~ 0), and (3) are the initial conditions at t = O. consisting of the given initial
displacement (initial shape) f(x, y) and the given initial velocity g(x, y), where Ut = au/at.
We see that these conditions are quite similar to those for the string in Sec. 12.2.
572
CHAP. 12
Partial Differential Equations (PDEs)
y
bt------,
R
a
Fig. 299.
x
Rectangular membrane
As a first important model, let us consider the rectangular membrane R in Fig. 299,
which is simpler than the circular drumhead to follow. Then the boundary in (2) is the
rectangle in Fig. 299. We shall solve this problem in three steps:
Step 1. By separating variables, setting !leX, y, t) = F(x, y)C(t) and later F(x, y) = H(x)Q(y)
we obtain from (I) an ODE (4) for G and later from a PDE (5) for F two ODEs (6) and
(7) for Hand Q.
Step 2. From the solutions of those ODEs we determine solutions (13) of (1)
("eigenfunctions" Limn) that satisfy the boundary condition (2).
Step 3. We compose the
Umn
into a double series (14) solving the whole model (I), (2), (3).
Step 1. Three ODEs From the Wave Equation (1)
To obtain ODEs from (I), we apply two successive separations of variables. In the first
separation we set u(x, y, t) = Flx, y)G(t). Substitution into (I) gives
where subscript'> denote partial derivatives and dots denote derivatives with respect to t.
To separate the variables, we divide both sides by c 2 FG:
C
c2C
=
1
F (Fxx
+
Fyy).
Since the left side depends only on t. whereas the right side is independent of t. both sides
must equal a constant. By a simple investigation we see that only negative values of that
constant will lead to solutions that satisfy (2) without being identically zero: this is similar
to Sec. 12.3. Denoting that negative constant by - v 2 , we have
This gives two equations: for the "time function" G(t) we have the ODE
(4)
where 11.
= cv.
and for the "amplitude function" F(x. y) a PDE. called the two-dimef15iofl({/ Helmholtz3
equation
(5)
3HERMANN VON HELMHOLTZ (!821-J894), German physici~t, known for his basic work in
thermodynamics, fluid flow. and acoustics.
SEC. 12.8
573
Rectangular Membrane. Double Fourier Series
Separation of the Helmholtz equation is achieved if we set F(x, y)
substitution of this into (5) we obtain
=
H(x)Q(y). By
To separate the variables, we divide both sides by HQ, finding
Both sides must equal a constant, by the usual argument. This constant must be negative,
say, -k 2 , because only negative values will lead to solutions that satisfy (2) without being
identically zero. Thus
This yields two ODEs for Hand Q, namely,
(6)
and
(7)
Step 2. Satisfying the Boundary Condition
General solutions of (6) and (7) are
H(x)
= A cos kx + B sin kx
and
Q(y)
= C cospy + D sinpy
with constant A. B. C, D. From II = FG and (2) it follows that F = HQ must be zero on
the boundary, that is, on the edges x = 0, x = a, Y = 0, Y = b; see Fig. 299. This gives
the conditions
H(O) = 0,
H(a)
= 0,
Q(O) = 0,
Q(b) = O.
Hence H(O) = A = 0 and then H(n) = B sin ka = O. Here we must take B
otherwise H(x) == 0 and F(x, y) == O. Hence sin ka = 0 or ka = 11177, that is,
k=
11177
a
(m integer).
*' 0 since
574
CHAP. 12
Partial Differential Equations (PDEs)
In precisely the same fashion we conclude that C = 0 and p must be restricted to the
values p = n7Tlb where n is an integer. We thus obtain the solutions H = Hm, Q = Qm
where
Hm(x)
I717TX
=
sin - a
171 =
117TY
and
Qn(Y) = sin -:- '
11
1.2.... ,
= 1,2, ....
As in the case of the vibrating string, it is not necessary to consider 111, n = - I, - 2, ...
since the corresponding solutions are essentially the same as for positive m and n, except
for a factor - I. Hence the functions
(8)
F mn(x, y) = Hm(x)Qn(Y) =
•
11l7TX
.
117TY
Sill - - Sill -b-
a
III =
,
11
1,2, ... ,
= I, 2, ... ,
are solutions of the Helmholtz equation (5) that are zero on the boundary of our membrane.
Eigenfunctions and Eigenvalues. Having taken care of (5), we tum to (4). Since
= 1J2 - k 2 in (7) and A = CIJ in (4). we have
p2
Hence to k =
l717Tla
and p =
l17Tlb
there corresponds the value
III
= 1,2, ... ,
Il
= 1,2, ... ,
(9)
in the ODE (4). A corresponding general solution of (4) is
It follows that the functions
(10)
umn(x, y, t)
=
111Ttn(.\:'
y.
t) =
(Brnn cos Amnt
+
F mn(x. y)Gmn(tt written out
*.
Bmn
SIn
Amnt)
. 1117TX . Il7TY
SIn - - SIn - a
b
with Amn according to (9), are solutions of the wave equation (I) that are zero on
the boundary of the rectangular membrane in Fig. 299. These functions are called the
eigenfunctions or characteristic jilllctiol1S. and the numbers Amn are called the
eigenvalues or characteristic values of the vibrating membrane. The frequency of U mn is
Amn I27T.
Discussion of Eigenfunctions. It is very interesting that, depending on a and b, several
functions Fm11 may correspond to the same eigenvalue. Physically this means that there
may exist vibrations having the same frequency but entirely different nodal lines (curves
of points on the membrane that do not move). Let us illustrate this with the following
example.
SEC. 12.8
Rectangular Membrane. Double Fourier Series
E X AMP L E 1
575
Eigenvalues and Eigenfunctions of the Square Membrane
Consider the square membrane with a = b = I. From (9) we obtain its eigenvalues
~
1""""22
+ n"-.
Autn = C7fV Ill'"
(II)
Hence Amn =
An",'
but for
111
*
F mn = sin
11
the correspouding functions
II/1n
sin 117T)'
are certainly different. For example. to
F12 =
sin
'lTX
and
A12 = A2l = C'lT\'
sin 2'lTV
Fum
= sin /I'lTT sin 1117T)'
'5 there correspond the two functions
and
F2l
= sin 2'lTX sin 'lTy.
Hence the corresponding solutions
and
have the nodal lines J = ~ and x = ~. respectively (see Fig. 300). Taking
obtain
Bl2 =
1 and B~2 =
8;1 =
O~ we
(12)
which represents another vibration corresponding to the eigenValue C7TVs. The nodal line of this function is the
solution of the equation
F12
+
B21F2l
= sin 'lTX sin 2'lTY +
B21
sin 2'lTX sin
'IT)" =
0
or, since sin 2a = 2 sin a cos a,
sin 1T.r sin
(13)
'IT.\'
(cos
'lTY
+
B21
cos
'lTx)
= O.
This solution depends on the value of B2l (see Fig. 301).
From \ I I) we see that even more than two functions may correspond to the same numerical value of Amn.
For example, the four functions F 1S• F S1 , F 47 • and F74 correspond to the value
because
This happens because 65 can be expressed as the sum of two squares of positive integers in several ways.
According to a theorem by Gauss, this is the case for every sum of two squares among whose prime factors
there are at least two different ones of the form 411 + I where II is a positive integer. In our case we have
•
65 = 5·13 = (4 + 1)(12 + I).
DOC]
'Tln
W
U
Un. Un.
U 21 • U 22•
=-10
B2l = -0.5
[J
I
I
I
I
I
Nodal lines of the solutions
un. U 31 in the case of
the square membrane
Fig. 300.
B21
, -_ _..L:.._----, B21 =-1
B21 =0
B2l =
" - - - - - - - - ' B21 =
0.5
1
Fig. 301. Nodal lines
of the solution (12) for
some values of B21
576
CHAP. 12
Partial Differential Equations (PDEs)
Step 3. Solution of the Model (1), (2), (3).
Double Fourier Series
So far we have solutions (10) satisfying (I) and (2) only. To obtain the solution that also
satisfies (3), we proceed as in Sec. 12.3. We consider the double series
x
u(x, y, t) =
x
L L
umn~x, y, t)
m=ln=l
(14)
=
x
x
2: L
+
(Bmn cos Amn t
m=ln=l
11l17X
'"
1l17"
B;nn sin Amnt) sin - - sin - - '
a
b
(without discussing convergence and uniqueness). From (14) and (3a), setting t
have
(15)
u(x, y, 0)
=
1Il17X
x
00
1117"
L L
Bmn sin - - sin -b'
m=ln=l
a
=
=
0, we
f(x, y).
Suppose that f(x. y) can be represented by (15). (Sufficient for this is the continuity of
f, afli)x, BflBy, a2 ftr)xBy in R.) Then (15) is called the double Fourier series of f(x, y)
Its coefficients can be determined as follows. Setting
:>0
(16)
=
KmCY)
1l17y
•
L
b
Bmn sm
n=l
we can write (15) in the form
2:
f(x, y) =
17l17X
Knb) sin - - .
a
7n=1
For fixed y this is the Fourier sine series of f(x, y), considered as a function of x. From
(4) in Sec. 11.3 we see that the coefficients of this expansion are
(17)
KmC\')
=
2
la
a
0
-
m17X
f(x, y) sin - - dx.
a
Furthermore, (16) is the Fourier sine series of Km(Y), and from (4) in Sec. 11.3 it follows
that the coefficients are
B
mn
2
= -b
lbK (,,)
0
m.
1117,,'
sin - - ' d".
b'
From this and (17) we obtain the generalized Euler formula
(18)
Bmn
=
4
-b
a
Ib1a f(x, y) sin - - sin - - ' dx dy
0 0
11117X
1117\'
a
b
17l
= 1, 2, ...
n = 1,2, .,.
SEC. 12.8
Rectangular Membrane. Double Fourier Series
577
for the Fourier coefficients of f(x, y) in the double Fourier series (15).
The Bmn in (14) are now determined in terms of f(x, y). To determine the B;;m, we
differentiate (14) termwise with respect to t; using (3b), we obtain
au
at
I
t=O
*
:L :L
ex;
00
=
.
IIl7TX
n7TY
BmnAmn Sin - - sin -b' = g(x, y).
a
m=l n=l
Suppose that g(x. y) can be developed in this double Fourier series. Then. proceeding as
before. we find that the coefficients are
4
abAmn
B,~n = - - -
(19)
JbJag(x,
0
m7TX
n7TV
a
b
111
y) sin - - sin --- dx d,'
0
.
n
=
1, 2, ...
=
1, 2, ....
Result. If f and g ill (3) are such that u can be represented by (14), then (14) with
coefficients (18) and (19) is the solution of the model (1), (2). (3).
E X AMP L E 2
Vibration of a Rectangular Membrane
Find the vibrations of a rectangular membrane of sides a = 4 ft and b = 2 ft (Fig. 302) if the tension is
12.5 Ib/ft. the dem,ity is 2.5 slugs/fr (as for light rubber). the initial velocity is O. ami the initial displacement is
(20)
y
2
1--------.
R
x
4
Membrane
Initial displacement
Fig. 302.
Solution.
c
2
= TIp = 12.512.5 = 5 [ft2/sec
4 JJ
2
0
J
2
1. Also. B~tn =
0.1(4x -
m7fX
117fl'
x 2 )(2y - y 2 ) sin - sin --dx dy
0
4
I
20
0 from (19). From (18) and (2m.
4
Bmn = - 4'2
Example 2
J
4
2
2
In7Tr
(4x - x 2 ) sin -4- dr
o
2
(2l' - y ) sin
2
117TY
dy.
0
Two integrations by parts give for the first integral on the right
(m
odd)
and for the second integral
(11 odd).
For even m or 11 we get O. Together with the factor 1120 we thus have B.rnn = 0 if 111 or 11 is even and
256· 32
Bmn =
20m
33
6
Il 7f
(m and
Il
both odd)
CHAP. 12
578
Partial Differential Equations (PDEs)
From this. (9), and (14) we obtain the answer
_
nl7TX
_
n7Ty
SlIl-- SIn--
4
(21)
= 0.426050 ( cos
+
1
27 cos
Vs1TVs
4
t
Vs1Tv'13
4
f
sin
sin
1TX
4
31TX
4
~in
sin
21Tl'
1Ty
2 +
+
J
27-
Vs1TV37
cos
I
729 cos
2
4
I
Vs1TV45
4
1
sin
sin
1TX
4
3m
4
31TY
SIn
-2-
31Tl'
)
sin ~ + . .. .
To discuss this solution, we note that the first term i~ very similar to the initial shape of the membrane. has no
nodal lines, and is by far the dominating term because the coefficients of the next terms are much smaller. The
second term has two horizontal nodal lines ly = 2/3, 4/3), the third term two vertical ones lx = 4/3, 8/3), the
fourth term two horizontal and two vertical ones, and so on.
•
1. (Frequency)
How does the frequency of the
eigenfunctions of the rectangular membrane change if
(a) we double the tension, (b) we take a membrane of
half the mass of the original one, (c) we double the
sides of the membrane? (Give reason.)
SQUARE MEMBRANE
2. Determine and sketch the nodal lines of the
eigenfunctions of the square membrane for m
3, 4 and n = I, 2, 3, 4.
=
I, 2,
If-~
Double Fourier Series. Represent f(x, y) by a
series (15), where 0 < x < I. 0 < Y < I.
3. f(x, y)
=
Fig. 303.
\
4. f(x, y) = x
5. f(x, y) = y
6. f(x, y) = x + y
7. f(x, y) = xy
8. f(x, y) = xy(1 - x)(l - y)
9. CAS PROJECT. Double Fourier Series. (a) Wlite a
program that gives and graphs partial sums of (\5).
Apply it to Probs. 4 and 5. Do the graphs show that
those partial sums satisfy the boundary condition (3a)?
Explain Why. Why is the convergence rapid?
(b) Do the tasks in (a) for Prob. 3. Graph a portion,
say, 0 < x < ~, 0 < Y < ~, of several partial sums on
common axes, so that you can see how they differ. (See
Fig. 303.)
(c) Do the tasks in (b) for functions of your choice.
Partial sums 52 •2 and 510.10
in CAS Project 9b
10. CAS EXPERIMENT. Quadruples of F mn- Write a
program that gives you four numerically equal "mn in
Example I, so that four different Fmn correspond to
it. Sketch the nodal lines of F 18 , F 81 , F 47 , F74 in
Example I and similarly for further F mn that you will
find.
111-131
Deflection. Find the deflection u(x, y, t) of the
square membrane of side 7r and c 2 = 1 if the initial velocity
is 0 and the initial deflection is
11. k sin 2x sin 5y
12. 0.1 sin x siny
13. O.lxy( 7r
-
x)( 7r
-
y)
RECTANGULAR MEMBRANE
14. VerifY the discussion of the terms of (21) in Example 2.
15. Repeat the task of Prob. 2 when a = 4 and b = 1.
SEC. 12.9
21. f(x, y) = xy(a 2
16. Verify the calculation of Bmn in Example 2 by
integration by parts.
17. Find eigenvalues of the rectangular membrane of sides
a = 2 and b = I to which there correspond two or
more different (independent) eigenfunctions.
18. (Minimum property) Show that among all rectangular
membranes of the same area A = ab and the same c
the square membrane is that for which Un [see (10)]
has the lowest frequency.
-
x 2 )(b 2
-
y2)
22. j(x. y) = xy(a - x)(b - y)
23. (Deflection) Find the deflection of the membrane of
sides a and b with c 2 = I for the initial deflection
f (x, y)
3'7TX
. 4'7TY
. I veI oClty
. 0.
. sm
- sm
- an d"mltla
a
b
=
24. Repeat the task in Prob. 23 with c 2 = 1, for f(x, y) as
in Prob. 22 and initial velocity O.
25. (Forced vibrations) Show that forced vibrations of a
membrane are modeled by the PDE U tt = C 2 V 2 U + PIp,
where P(x, y, t) is the external force per unit area acting
perpendicular to the xy-plane.
119-221
Double Fourier Series. Represent f(x, y)
(0 < x < a, 0 < Y < b) by a double Fourier series (15).
19. f(x, y) = k
20. f(x, y) = 0.25x)"
12.9
579
Laplacian in Polar Coordinates. Circular Membrane. Fourier-Bessel Series
Laplacian in Polar Coordinates.
Circular Membrane.
Fourier-Bessel Series
In boundary value problems for PDEs it is a general principle to use coordinates in which
the fonTIula for the boundary is as simple as possible. Since we want to discuss circular
membranes (drumheads), we first transform the Laplacian in the wave equation (1),
Sec. 12.8,
(1)
(subscripts denoting partial derivatives) into polar coordinates
e=
Hence x
= r cos e,
y=
r sin
v
arctan -'-- .
x
e. By the chain rule (Sec. 9.6) we obtain
Differentiating once more with respect to x and using the product rule and then again the
chain rule gives
(2)
Also, by differentiation of rand
rx
=
x
-v"~=+=y=2
x
r
(J
we find
e = -----;:x
1
+
(Y/X)2
y
r2 .
580
CHAP. 12
Partial Differential Equations (PDEs)
Differentiating these two formulas again, we obtain
r xx
=
exx =
r
-\"-
(-~)
1'3
r
x
= 2xy
1'4·
We substitute all these expressions into (2). Assuming continuity of the first and second
partial derivatives, we have U rfJ = lie,., and by simplifying,
(3)
In a similar fashion it follows that
(4)
By adding (3) and (4) we see that the Laplacian of II in polar coordinates is
(5)
Circular Membrane
Circular membranes occur in drums, pumps, microphones, telephones, and so on. This
accounts for their great importance in engineering. Whenever a circular membrane is plane
and its material is elastic, but offers no resistance to bending (this excludes thin metallic
membranes!), its vibrations are modeled by the two-dimensional wave equation in polar
coordinates obtained from (l) with y 2 u given by (5), that is,
(6)
Y
R
x
Fig. 304. Circular
membrane
p
We shall consider a membrane of radius R (Fig. 304) and determine solutions u(r. t)
that are radially symmetric. (Solutions also depending on the angle e will be discussed in
the problem set.) Then um] = 0 in (6) and the model of the problem (the analog of (1).
(2), (3) in Sec. 12.8) is
(7)
(8)
(9a)
u(R, t)
= 0 for all t
u(r, 0)
=
~
0
fer)
(9b)
Here (8) means that the membrane is fixed along the boundary circle
deflection fer) and the initial velocity g(r) depend only on 1', not on
expect radially symmetric solutions u(r, t).
l'
e,
= R. The initial
so that we can
SEC. 12.9
581
Laplacian in Polar Coordinates. Circular Membrane. Fourier-Bessel Series
Step 1. Two ODEs From the Wave Equation (7).
Bessel's Equation
Using the method of separation of variables, we first determine solutions u( r, t) = W(r) GCt).
(We write W, not F because W depends on r, whereas F, used before, depended on x.)
Substituting u = WG and its derivatives into (7) and dividing the result by c 2 WG, we get
G W1 (W"+ -1)
W'
r
-2 - = c G
where dots denote derivatives with respect to t and primes denote derivatives with respect
to r. The expressions on both sides must equal a constant. This constant must be negative,
say, -k 2 , in order to obtain solutions that satisfy the boundary condition without being
identically zero. Thus,
This gives the two linear ODEs
where A = ck
(10)
and
w"+
(11)
r
We can reduce (11) to Bessel's equation (Sec. 5.5) if we set s = kr. Then IIr
retaining the notation W for simplicity, we obtain by the chain rule
W'=
dW
dW ds
dr
ds
=
kls and,
2
dW
=-k
dr
ds
and
d W 2
W " =-2-k.
ds
By substituting this into (11) and omitting the common factor k 2 we have
(12)
d2W
ds
2
1 dW
+ -
S
ds
+ W= O.
This is Bessel's equation (I), Sec. 5.5, with parameter v =
o.
Step 2. Satisfying the Boundary Condition (8)
Solutions of (12) are the Bessel functions 10 and Yo of the first and second kind (see
Secs. 5.5, 5.6). But Yo becomes infinite at 0, so that we cannot use it because the deflection
of the membrane must always remain finite. This leaves us with
(13)
W(r) = 10(5) = lo(kr)
(5
= kr).
582
CHAP. 12
Partial Differential Equations (PDEs)
On the boundary r = R we get W(R) = 10(kR) = 0 from (8) (because G == 0 would imply
u == 0). We can satisfy this condition because 10 has (infinitely many) positive zeros,
S = 0'1' 0'2, ••• (see Fig. 305), with numerical values
0'1
= 2.4048,
0'2
= 5.5201,
0'3
= 8.6537,
0'4
= 11.7915,
0'5
= 14.9309
and so on. (For further values, consult your CAS or Ref. [GRI] in App. l.) These zeros
are slightly inegularly spaced. as we see. Equation (13) now implies
kR = am
(14)
111
thus
=
I, 2, ....
Hence the functions
m = 1,2,'"
(15)
are solutions of (I 1) that are zero on the boundary circle r = R.
Eigenfunctions and Eigenvalues. For Wm in (15), a corresponding general solution of
(10) with A = Am = ckm = camlR is
Hence the functions
with III = 1,2, ... are solutions of the wave equation (7) satisfying the boundary condition
(8). These are the eigenfunctions of our problem. The corresponding eigenvalues are Am.
The vibration of the membrane conesponding to Urn is called the 111th normal mode;
it has the frequency Am l27r cycles per unit time. Since the zeros of the Bessel function 10
are not regularly spaced on the axis (in contrast to the zeros of the sine functions appearing
in the case of the vibrating string), the sound of a drum is entirely different from that of
a violin. The fonTIs of the normal modes can easily be obtained from Fig. 305 and are
shown in Fig. 306. For 111 = I, all the points of the membrane move up (or down) at the
same time. For 111 = 2, the situation is as follows. The function W2(r) = 10 (a 2r1R) is zero
for a2r1R = 0'1' thus r = a1R1a2' The circle r = alR1cx2 is, therefore, nodal line, and
when at some instant the central part of the membrane moves up, the outer part
(r > a l Rl(2 ) moves down. and conversely. The solution um(r. t) has 111 - I nodal lines,
which are circles (Fig. 306).
)
-10
/
\
5
~
10
~'~__~~__~~__~~__~__~__-L~__~'~~~_ __ _
-04
-03
01 - - - - / 02
Fig. 305.
Bessel function Jo (5)
03
04
s
SEC. 12.9
583
Laplacian in Polar Coordinates. Circular Membrane. Fourier-Bessel Series
I
I
I
I
i~.1
{" i __ . 1 t(
I
I
f
1/ 1 ~_
I
1/
\ I
I
1'-. I
1
1
1
\
'''-.
--~
m= 1
Fig. 306.
-'
,I
',1
I
/
m=3
m=2
Normal modes of the circular membrane in the case of vibrations
mdependent of the angle
Step 3. Solution of the Entire Problem
To obtain a solution lI(r, t) that also satisfies the initial conditions (9), we may proceed
as in the case of the string. That is. we consider the series
(17)
u(r, t)
= ~1
W"lr)Gm(t)
= ~1
(Am cos Am!
+
Bm sin A",I) 10 (
(leaving aside the problems of convergence and uniqueness). Setting!
we obtain
=
~n 1')
0 and using (9a).
(18)
Thus for the series (17) to satisfy the condition (9a), the constants Am must be the
coefficients of the Fourier-Bessel series (18) that represents fer) in terms of 10 (O'm rlR );
that is [see (10) in Sec. 5.8 with 11 = O. O'O,rn =
and x = 1'1,
am,
(19)
Am
=
2
R II
2
2
(O'Ul)
JR rf(r)1 (am)r dr
0
0
-
R
(111
= 1, 2, .. ').
Differentiability of fer) in the interval 0 ~ r ~ R is sufficient for the existence of the
development (18); see Ref. [Al3]. The coefficients Em in (17) can be determined from
(9b) in a similar fashion. Numeric values of Am and Em may be obtained from a CAS or
by a numeric integration method. using tables of 10 and 11 , However, numeric integration
can sometimes be avoided, as the following example shows.
584
E X AMP L E 1
CHAP. 12
Partial Differential Equations (PDEs)
Vibrations of a Circular Membrane
Find the vibrations of a circular drumhead of radius I ft and density 2 slugs/ft2 if the tension is 8 Iblft, the initial
velocity is O. and the initial displacement is
f(,.)
Solutioll.
=
I -
r2
[ftl.
c 2 = TIp = 8/2 = 4 [ft2/sec 2 Also Bm = 0, since the initial velocity is O. From (19) and Example
1.
3 in Sec. 5.8, since R = I, we obtain
4J2 (am )
a",21r 2(a",)
8
where the last equality follows from (24c). Sec. 5.5, with v = I, that b.
J2{am ) = -
2
Urn
2
h(crm ) - JO(u,n) = J 1(am )·
am
Table 9.5 on p. 409 of [GRI] gives lYm dnd J~(a",). From this we get h(am) = -J~(a",) by (24b), Sec. 5.5.
with v = 0, and compute the coefficients Am:
171
am.
11 (0'",)
12 (0CyJ
Am
2
3
4
5
6
7
8
9
10
2.40483
5.52008
8.65373
11.79153
14.93092
18.07106
2l.21164
24.35247
27.49348
30.63461
0.51915
-0.34026
0.27145
-0.23246
0.20655
-0.18773
0.17327
-0.16170
0.15218
-0.14417
0.43176
-0.12328
0.06274
-0.03943
0.02767
-0.02078
0.01634
-0.0l328
0.01107
-0.00941
1.10801
-0.l3978
0.04548
-0.02099
0.01164
-0.00722
0.00484
-0.00343
0.00253
-0.00193
Thus
fer) = 1.108Jo(2.4048,.) - 0.140Jo(5.520Ir)
+ 0.045Jo(!!.6537r) - ....
We see that the coefficients decrease relatively slowly. The sum of the explicitly given coefficients in the table
is 0.99915. The sum of all the coefficients should be I. (Why?) Hence by the Leibniz test in App. A3.3 the
partial sum of those terms gives about three correct decimals of the amplitude (fr).
Since
from (17) we
tl1U~
obtain the solution (with,. measured in feet and t in seconds)
lI{r. t) = 1.1 08Jo(2.4048r) cos 4.8097 t - 0.140J0<5.5201 r) cos 11.0402t + 0.045JO<8.6537r) (;O~ 17.3075t - ....
In Fig. 306, m = I gives an idea of the motion of the tlrst term of our series, 111 = 2 of the second term, and
= 3 of the third term, so that we can "see" our result about as well as for a violin string in Sec. 12.3.
•
111
SEC 12.9
Laplacian in Polar Coordinates. Circular Membrane. Fourier-Bessel Series
========= -.
3:£2:--
SET
1. Why did we use polar coordinates in this section?
2. Work out the details of the calculation leading
to
with arbitrary Ao and
the
I
Laplacian in polar coordinates.
3. If
--n-_--cl
e,
7rIlR
is independent of
then (5) reduces to
y 2 u = II'T + uTlr. Derive this directly from the
Laplacian in Cartesian coordinates.
l/
585
1 il
riJU)
4. An alternative form of (5) is v2 u = r ill' (
ilr
iJ211
+ r2 ae 2 . Derive this from (5).
7rIlR
TI"
f(A) cos nA
de,
-TI"
I
Bn =
f
n-l
f
"
fee) sin lie de.
_ ..
(e) Compatibility condition Show that (9), Sec. 10.4,
impo~es on f(O) in (d) the "compatibility condition"
5. (Radial solution) Show that the only solution of
y 2 u = 0 depending only on r = V.~ +
u = a In r + b with constant a and b.
6. TEAM PROJECT.
Nemnann Problems
Series
for
i
is
Dirichlet
and
"n
(a) Show that lin = 1'71 cos lie.
= rn sin ne, II = 0,
I, ... , are solutions of Laplace's equation -V2 u = 0
with ,211 given by (5). (What would Un be in Cartesian
coordinates'? Experiment with small II.)
(b) Dirichlet problem (See Sec. 12.5) Assuming that
term wise differentiation is permissible. show that a
solution of the Laplace equation in the disk r < R
satisfying the boundary condition u(R, e) = I(e)
(f given) is
u(r, B> =
00
x
+ ~l
(20)
+
bn
[
an (r)n
Ii cos lie
f (see
e
(d) Neumann problem Show that the solution of the
Neumann problem y211 = 0 if r < R, llN(R, e) = f(B)
(where LIN = iJ"/iJN is the directional de11vative in the
direction of the outer normal) is
0)
= Ao
+
L
n~1
ELECTROSTATIC POTENTIAL.
STEADY-STATE HEAT PROBLEMS
The electrostatic potential satisfies Laplace's equation
'V 2 11 = 0 in any region free of charges. Also the heat
equation lit = C 2 ,211 (Sec. 12.5) reduces to Laplace's equation
if the temperature u is tinIe-independent ("steady-state
case"). Using (20), find the potential (equivalently: the
steady-state temperature) in the disk r < I if the boundary
values are (sketch them, to see what is going on).
7. u(l. 01 = 40 cos3 0
8. u( I, e)
800 sin3 0
9. 1I(l, 0)
< e < ~7r and 0 otherwise
< e < ~7r and 0 otherwise
I0 I if - 7r < 0 < 7r
0 2 if - 7r < 0 < 7r
12. lI( I. 0)
(c) Dirichlet problem Solve the Dirichlet problem
using (20) if R = I and the boundary values are
u(O) = -100 volts if -7r < 0 < O. u(O) = 100 volts
if 0 < < 7r. (Sketch this disk, indicate the boundary
values.)
u(r,
17-121
11. u(l, 0)
nO
where (In' bn are the Fourier coefficients of
Sec. 11.I).
o.
10. u( I, e)
( r)n sin ]
R
(f) Neumann problem Solve y 2 u = 0 in the annulus
I < r < 3 if liTO, 0) = sin 0, U,(3, e) =
rn(An cos IlO + Bn sin
lie)
I IO if
o if
-!7r
-!7r
13. CAS EXPERIMENT. Equipotential Lines. Guess
what the equipotential lines tI(r, e) = const in Probs.
9 and 11 may look like. Then graph some of them,
using partial sums of the series.
14. (Semidisk) Find the electrostatic potential in the
semidisk r < I, 0 < < 7r which equals I IO O( 7r - B>
on the semicircle I' = I and 0 on the segment
e
-I<x<l.
15. (Semidisk) Find the steady-state temperature in a
semicircular thin plate r < a, 0 <
< 7r with the
semicircle I' = a kept at constant temperature 110 and
the segment - ( l < X < a at O.
e
16. (Illvariance) Show that y 2 u is invariant under
translations x* = x + (l, y* = Y + b and under rotations
x* = x cos a - y sin a, y* = x sin a + y cos a.
586
CHAP. 12
Partial Differential Equations (PDEs)
CIRCULAR MEMBRANE
17. (Frequency) What happens to the frequency of an
eigenfunction of a dfilm if you double the tension?
18. (Size of a drum) A small dfilm should have a higher
fundamental frequency than a large one, tension and
density being the same. How does this follow from our
formulas?
19. (Tension) Find a formula for the tension required to
produce a desired fundamental frequency f I of a
drum.
20. CAS PROJECT. Normal Modes. (a) Graph the
nOimal modes 114' 115' 116 as in Fig. 306.
(b) Write a program for calculating the Am's in
Example 1 and extend the table to III = 15. Verify
numerically that am = (Ill - ~) 7T and compute the
error for 111 = 1, . . . , 10.
(c) Graph the initial deflection fer) in Example 1 as
well as the fIrst three partial sums of the series.
Comment on accuracy.
(d) Compute the radii of the nodal lines of U2' U3' 114
when R = I. How do these values compare to those of
the nodes of the vibrating string of length I? Can you
establish any empirical laws by experimentation with
further 11m?
21. (Nodal lines) Is it possible that for fixed L" and R two
or more II", [see (16)] with ditlerent nodal lines
correspond to the same eigenvalue? (Give a reason.)
22. Why is Al + A2 + ... = 1 in Example I? Compute
the first few partial sums until you get 3-digit accuracy.
What does this problem mean in the field of music?
0,
(24)
where A = ck,
r
Show that the PDE can now be separated by
substituting F = W(r)Q(O), giving
(25)
25. (Periodicity) Show that Q(8) must be periodic with
period 27T and. therefore. 11 = 0, 1, 2•••. in (25) and
(26). Show that this yields the solutions Qn = cos 110,
Qn * = sin nO, Wn = In(kr), 11 = 0, 1, . . . .
26. (Boundary condition) Show that the boundary
condition
(27)
u(R. O. t) = 0
leads to k = k mn = amnlR, where s = a mn is the mth
positive zero of In(s).
27. (Solutions depending on both rand 8) Sho\\ that
solutions of (22) satisfying (27) are (see Fig. 307)
(28)
23. (Nonzero initial velocity) Show that for (17) to satisfy
(9b) we must have
(21)
X
f
R
o
rg(r)J o(a 1ll rlR) dr.
VIBRATIONS OF A CIRCULAR MEMBRANE
DEPENDING ON BOTH rAND (J
24. (Separations) Show that substitution of II = F(r, (})G(t)
into the wave equation (6), that is,
gives an ODE and a PDE
CD
Fig. 307.
Nodal lines of some of the solutions (28)
28. (Initial condition) Show that
Bmn = 0, Bi;,n = 0 in (28).
29. Show that II~,O = 0 and
the current section.
UmO
II t {r,
O. 0) = 0 gives
is identical with (16) in
30. (Semicircular membrane) Show that Ull represents
the fundamental mode of a semicircular membrane and
fInd the corresponding frequency when c 2 = I and
R = 1.
SEC 12.10
Laplace's Equation in Cylindrical and Spherical Coordinates Potential
12.1 0
587
Laplace's Equation in Cylindrical and
Spherical Coordinates. Potential
Laplace's equation
(1)
is one of the most important PDEs in physics and its engineering applications. Here,
x, y, z are Cartesian coordinates in space (Fig. 165 in Sec. 9.1), /l xx = a2u/ax2, etc. The
expression V 2 u is called the Laplacian of u. The theory of the solutions of (1) is called
potential theory. Solutions of (I) that have COlltillUOUS second partial derivatives are
known as harmonic functions.
Laplace's equation occurs mainly in gravitation, electrostatics (see Theorem 3,
Sec. 9.7). steady-state heat flow (Sec. 12.5), and fluid flow (to be discussed In
Chap. 18.4).
Recall from Sec. 9.7 that the gravitational potential u(x. y.::) at a point (x. y. z) resulting
from a single mass located at a point (X. Y. Z) is
(2)
u(x, y, z) =
c
c
V(x -
r
X)2
+
(y - y)2
+
(r> 0)
(z - Z)2
and u satisfies (1). Similarly, if mass is distributed in a region T in space with density
p(X, Y, Z), its potential at a point (x, y, ::) not occupied by mass is
(3)
u(x, y, z)
=
k
III
p(X, Y, Z)
T
dX dY dZ.
r
= 0 (Sec. 9.7) and p is not a function of x, y, ::.
Practical problems involving Laplace's equation are boundary value problems in a
region T in space with boundary surface S. Such a problem is called (see also Sec. 12.5
for the two-dimensional case):
It satisfies (I) because V2(\/r)
(I) First boundary value problem or Dirichlet problem if u is prescribed on S.
(II) Second boundary value problem or Neumann problem if the normal
derivative Un = au/an is prescribed on S.
(III) Third or mixed boundary value problem or Robin problem if II is prescribed
on a portion of S and
lin
on the remaining portion of S.
Laplacian in Cylindrical Coordinates
The first step in solving a boundary value problem is generally the introduction of
coordinates in which the boundary surface S has a simple representation. Cylindrical
related to x,
symmetry (a cylinder as a region T) calls for cylindrical coordinates r,
y, z by
e, ::
(4)
x = r cos
e,
y
= r sin e,
z=z
(Fig. 308, p. 588).
588
CHAP. 12
Partial Differential Equations (PDEs)
2
2
'i'
(r.e.z)
Ir,8,1/»
1
1
1
r
Iz
:-----
"
e r"
e
Y
1
,I
"
x
Fig. 308.
x
Cylindrical coordinates
Fig. 309.
For these we get y 2 U immediately by adding
U zz
1-1
1
1
1
1
Y
'I
',I...
Spherical coordinates
to (5) in Sec. 12.9; thus,
(5)
Laplacian in Spherical Coordinates
Spherical symmetry (a ball as region T bounded by a sphere S) requires spherical
z by
coordinates r, e, lb related to x, y,
(6)
x
= r cos e sin efy.
y=
r sin
e sin efy.
(Fig. 309).
z = r cos efy
Using the chain rule (as in Sec. 12.9), we obtain V- 2 u in spherical coordinates
(7)
We leave the details as an exercise. It is sometimes practical to write (7) in the form
I
(7)
2
I [a
Yu=r2
ar
(2r -au)
ar
2
I -a (.smefyau ) + 1 -au] .
+2
2
sin efy aefy
defy
sin efy
ae
Remark on Notation.
Equation (6) is used in calculus and extends the familiar notation
for polar coordinates. Unfortunately, some books use e and efy interchanged, an extension
of the notation x = r cos efy, y = r sin efy for polar coordinates (used in some European
countries).
Boundary Value Problem in Spherical Coordinates
We shall solve the following Dirichlet problem in spherical coordinates:
(8)
(9)
(10)
a
[ -ar
(2r -au)
+ -- -a
ar
sin efy aefy
I
u(R,
efy)
= J(efy)
lim u(r, efy) = O.
'1"-->00
(.
SIll efy -au ) ] =
aefy
o.
SEC. 12.10
Laplace's Equation in Cylindrical and Spherical Coordinates. Potential
589
The PDE (8) follows from (7) by assuming that the solution £I will not depend on ebecause
the Dirichlet condition (9) is independent of e. This may be an electrostatic potential (or
a temperature) J(ep) at which the sphere S: r = R is kept. Condition (10) means that the
potential at infinity will be zero.
Separating Variables by substituting u(r. ep) = C(r)H(ep) into (8). MUltiplying (8) by
r2, making the substitution and then dividing by CH, we obtain
I
G
(2r dr
dC)
d
dr
=
-
I
d (.
dH )
H sin lb dlb sm ep dlb .
By the usual argument both sides must be equal to a constant k. Thus we get the two
ODEs
I
(11)
d
C dr (
=k
r2 dC)
dr
or
and
~
(12)
(Sin cb
sin ep dep
dH)
dep
+ kH = O.
The solutions of (11) will take a simple form if we set k
C' = dC/dr, etc., we obtain
r 2C"
(13)
+
2rC' -
n(1l
+
=
n(n
+ 1). Then, writing
l) C = O.
This is an Euler-Cauchy equation. From Sec. 2.5 we know that it has solutions C
Substituting this and dropping the common factor r a gives
a(a -
I)
+
2a - n(n
+ 1)
=
O.
a = nand
The roots are
Hence solutions are
(14)
and
We now solve (12). Setting cos ep
d
Consequently, (12) with k
(15)
=
n(n
= w, we have sin2 lb = 1 d
dep
dw
dl!' dep
+
C~(r) =
d
= -sin ep - .
dl\'
1) takes the form
d [ (I - w 2 ) -dH]
-d
II'
dw
+ n(n +
l)H
= O.
w 2 and
-n -
= ra
I
590
CHAP. 12
Partial Differential Equations (PDEs)
This is Legendre's equation (see Sec. 5.3), written out
(1 - w 2 )
(15')
d 2H
dH
2w -
-- -
d1l'2
dw
+
n(n
+ 1)H = O.
For integer II = 0, 1, ... the Legendre polynomials
11
= 0,1, " . ,
are solutions of Legendre's equation (15). We thus obtain the following two sequences
of solution II = GH of Laplace's equation (8), with constant An and Bn, where
n = 0, 1, ... ,
(16)
(a)
(b)
Use of Fourier-Legendre Series
Interior Problem: Potential Within the Sphere S.
We consider a series of terms from
(16a),
lI(r. ¢)
(17)
=
:L
Anrnpn(cos ¢)
(r
~
R).
n~O
Since S is given by r
we must have
= R, for (17) to satisfy the Dirichlet condition (9) on the sphere S,
(18)
n=O
that is, (18) must be the Fourier-Legendre series of f(¢). From (7) in Sec. 5.8 we get
the coefficients
AnRn =
(19*)
211
+
I
2
II -
f(w)Pn(w) dw
-1
where few) denotes f(¢) as a function of IV = cos ¢. Since dll· = -sin ¢ d¢, and the
limits of integration -I and I correspond to ¢ = 7T" and ¢ = 0, respectively, we also
obtain
(19)
An
=
2n
+
lR
n
I
1
7r
0
f(¢)Pn(cos ¢) sin ¢ d¢,
11
= 0,1, ...
If f(¢) and j'(¢) are piecewise continuous on the interval 0 ~ ¢ ~ 7T", then the series (17)
with coefficients (19) solves our problem for points inside the sphere because it can be
shown that under these continuity assumptions the series (17) with coefficients (19) gives
the derivatives occuning in (8) by termwise differentiation, thus justifying our derivation.
SEC. 12.10
591
laplace's Equation in Cylindrical and Spherical Coordinates. Potential
Exterior Problem: Potential Outside the Sphere S. Outside the sphere we cannot use
the functions Un in (16a) because they do not satisfy (to). But we can use the ll~ in (16b).
which do satisfy (to) (but could not be used inside S; why?). Proceeding as before leads
to the solution of the exterior problem
(20)
(r
~
R)
satisfying (8), (9), (10), with coefficients
(21)
En
=
+
2n
1
Rn + 1
2
IT.f(c/J)Pn(cos c/J) sin c/J dd>.
0
The next example illustrates all this for a sphere of radius I consisting of two hemispheres
that are separated by a small strip of insulating material along the equator, so that these
hemispheres can be kept at different potentials OW V and 0 V).
E X AMP L E 1
Spherical Capacitor
Find the potential inside and outside a spherical capacitor consisting of two metallic hemispheres of radius I ft
separated by a small slit for reasons of insulation, if the upper hemisphere is kept at 110 V J.nd the lower i~
grounded (Fig. 310).
Solutioll.
The given boundary condition is (recall Fig. 309)
IIO
{ 0
f(<I» =
0""-<1><71"/2
if
if
71"/2
< <I> ""-
71".
Since R = I, we thus obtain from (19)
2n
+
I
An = - - - . 110
2
I
I
.,,/2
Pn(COS <1» sin <I> d<l>
0
2n + I
- - 2 - . 110
1
Pn(w) dw
o
where U' = cos <1>. Hence Pn(COS <1» sin <1> d<1> = -p ..(w) tin'. we integrate from I to O. and we finally get rid
of the minus by integrating from 0 to I. You can evaluate this integral by your CAS or continue by using (II)
in Sec. 5.3, obtaining
M
L
An = 55(211 + 1)
m~O
where M = nl2 for even nand M =
(II -
(-I)'" "n 1
(2n - 2m)!
1
~ m.(n -
m).(n -
2)1
II!.
II
1)/2 for odd n. The integral equals lI(n - 2m
110 volts
x
Fig. 310.
".n-2m dll"
0
y
Spherical capacitor in Example 1
+
I). Thus
592
CHAP. 12
(22)
Partial Differential Equations (PDEs)
An
=
55(211 + 1) ~
m
(211 - 2m)!
21Z
L.J (-I)
m!(11 - m)!(11 - 2m
·m=O
Taking 11 = 0, we get Ao = 55 tsince O! = 1). For
11
=
I, 2, 3,' .. we get
2!
165
2
165
2 '
0!1!2!
275 ( 4!
A2 = - 0!2!3!
4
385
A3 =
2! )
1!1!l !
(6!
0!3!4!
--
8
+ I)!
4!)
1!2!2!
~
0,
385
8
---
etc.
Hence the potential (17) inside the sphere is (since Po = 1)
(23)
u(r, </J) = 55
+
165
2
r P1tcos </J) -
385 3
-8- r P 3 (cos </J)
+ ...
(Fig. 311)
with PI, P3 , ... given by (Il '), Sec. 5.3. Since R = I, we see from (19) and (2l) in this section that
= An' and (20) thus gives the potential outside the sphere
En
55
(24)
u(r, </J) =
r
+
165
-2-
2r
PI (cos </J) -
385
-4-
8r
P 3 (cos </J)
+ ....
Partial sums of these series can now be used for computing approximate values of the inner and outer potential.
Also, it is interesting to see that far away from the sphere the potential is approximately that of a point charge,
•
namely, 55/r. (Compare with Theorem 3 in Sec. 9.7.)
y
o
Jr
IT
2
Partial sums of the first 4, 6, and 11
nonzero terms of (23) for r = R = 1
Fig. 311.
E X AMP L E 2
Simpler Cases. Help with Problems
The technicalities occurring in cases like that of Example I can often be avoided. For instance, find the potential
inside the sphere S: r = R = I when S is kept at the potential j(</J) = cos 2</J. (Can you see the potential on S?
What is it at the North Pole? The equator? The South Pole? I
Solution.
2
w = cos </J, cos 2</J = 2 cos </J -
1 = 2w
2
-
I = ~P2(W) - ~ = ~(~w2 - ~) - ~. Hence the
potential in the interior of the sphere is
•
SEC. 12.10
-_. ..........-............... ---.
.-.-...
1. Derive (7) from V2 11 in Cartesian coordinates. (Show
the details.)
2. Find the surfaces on which the functions
zero.
3. Sketch the functions P,,(cos c/J) for
(11') in Sec. 5.3).
4. Sketch the functions
P3(COS
5. Verify that
* in (16)
!:!il
lin
and
lin
c/J) and
II
=
are
l/I' lI2. l/3
O. I. 2 (see
c/J).
+
y2
+ :2
is
II
13. f(tb) = 100
16. f( c/J)
sin 2 c/J
17. f( c/J)
35 cos 4c/J
+
20 cos 2tb
+
9
= elr + k with constant c
sphere is the same as that of a point charge at the origin.
Is this physically plausible?
19. Sketch the intersection of the equipotential surfaces in
Prob. 14 with the xo-plane.
7. (Dimension 3) Verify that
Vx 2
Find the potential in the interior of the sphere S: r = R = I
if this interior is free of charges and the potential on Sis:
18. Show that in Prob. 13 the potential exterior to the
and k.
=
BOUNDARY VALUE PROBLEMS IN
SPHERICAL COORDINATES r, 8, c/J
15. f( cp) = cos 3c/J
are solutions of (8).
POTENTIALS DEPENDING ONLY ON r
r = Vx 2
113-171
14. f(c/J) = cos c/J
P 4 (cos
6. (Dimension 3) Show that the only solution of the
Laplace equation depending only on
r
593
Laplace's Equation in Cylindrical and Spherical Coordinates. Potential
+
y2
1I =
dr.
+ :2. satisfies Laplace's equation in
spherical coordinates.
8. (Dirichlet problem). Find the electrostatic potential
between two concentric spheres of radii rl = 10 cm
and r2 = 20 em kept at potentials VI = 260 V amI
V 2 = 110 V. respectively.
9. (Dimension 2. logarithmic potential) Show that the
onl) solution of the two-dimensional Laplace equation
depending only on r =
with constant c and k.
V>;;2
+
.1'2 is
l/
=
c In r
+
k
10. (Logarithmic potential) Find the electrostatic potential
between two coaxial cylinders of radii r l = 10 cm and
r2 =
20 cm kept at potentials VI = 260 V and
V 2 = 110 V. respectively. Compare with Prob. 8.
Comment.
11. (Heat problem) If the sUiface of the ball
2
r2 = x
+ y2 + :2 ~ R2 is kept at temperature
zero and the initial temperature in the ball is f( 1').
show that the temperature u(r, t) in the ball is a solution
of lit = c 2 (u r l" + 2u,./r) satisfying the conditions
u(R.1) = O. u(r, 0) = fer). Show that setting v = ru
gives v t = C2VJT' vCR. t) = O. vCr. 0) = rf(r). Include
the condition v(O. t) = 0 (which holds because II must
be bounded at r = 0). and solve the resulting problem
by separating variables.
20. Find the potential exterior to the sphere in Example 2
of the text and in Prob. 15.
21. What is the temperature in a ball of radius I and of
homogeneous material if its lower boundary
hemisphere is kept at O°C and its upper at 100°C?
22. (Renection in a sphere) Let r, 0, c/J be spherical
coordinates. If u(r. O. c/J) satisfies V 2 11 = O. show that
vCr. O. c/J) = lI( Ilr. O. c/J)lr satisfies V 2 v = O. What
does this give for (l6)?
23. (Renection in a circle) Let r, 0 be polar coordinates.
If lI(r. 0) satisfies V2 l/ = 0, show that the function
v(r. 0) = lI(llr, 0) satisfies V2 v = O. What are
l/ = r cos 0 and v in terms of x and y? Answer the
same question for u = r2 cos 0 sin 0 and v.
24. TEAM PROJECT. Transmission Line and Related
PDEs. Consider a long cable or telephone wire
(Fig. 312) that is imperfectly insulated. so that leaks
occur along the entire length of the cable. The source
S of the current i(x, t) in the cable is at x = 0, the
receiving end T at x = I. The current flows from S to
T. through the load, and returns to the ground. Let the
constants R, L. C. and G denote the resistance,
inductance, capacitance to ground. and conductance to
ground. respectively. of the cable per unit length.
12. (Two-dimensional potential problems) Show that the
functions x 2 - )'2. XY. xl(x 2 + y2). eX cos y. r~ sin r.
cos x cosh y. I~ (x 2 ' + y2). and arctan <.,:Ix) satisfy
Laplace's equation "xx + l/yy = O. (Two-dimensional
potential problems are best solved by complex
allalysis, as we shall see in Chap. 18.)
x=O
Fig. 312.
x=l
Transmission line
CHAP. 12
594
Partial Differential Equations (PDEs)
(d) Telegraph equations. For a submarine cable,
G is negligible and the frequencies are low. Show that
this leads to the so-called submarine cable equations
or telegraph equations
(a) Show that ("first transmission line equation")
au
ax
ai
at
=Ri + L -
- -
where u(x, t) is the potential in the cable. Hint: Apply
Kirchhoff's voltage law to a small portion of the cable
between x and x + !n (difference of the potentials at
x and x + !n = resistive drop + inductive drop).
Find the potential in a submarine cable with ends
(x = 0, x = l) grounded and initial voltage distribution
Va = const.
(e) High-frequency line equations. Show that in the
case of alternating currents of high frequencies the
equations in (c) can be approximated by the so-called
high-frequency line equations
(b) Show
that for the cable in (a) ("second
transmission line equation"),
-
ai
ax
-
=
Gu
au
at
+ C-
Hint: Use Kirchhoff's current law (difference of the
currents at x and x + LlX = loss due to leakage to
ground + capacitive loss).
(c) Second-order PDEs. Show that elimination of
i or u from the transmission line equations leads to
Solve the first of them, assuming that the initial
potential is
Vo sin (7fx/l).
u xx
=
ixx =
12.11
LCutt
LCi tt
+
(RC
+
(RC
+
+
GLhl t
+
RGu.
GL)i t
+
RGi.
and ut(x. 0) = 0 and u = 0 at the ends x = 0 and
= I for all t.
x
Solution of PDEs by Laplace Transforms
Readers familiar with Chap. 6 may wonder whether Laplace transforms can also be used
for solving partial differential equations. The answer is yes, particularly if one of the
independent variables ranges over the positive axis. The steps to obtain a solution are
similar to those in Chap. 6. For a PDE in two variables they are as follows.
1. Take the Laplace transform with respect to one of the two variables, usually t. This
gives an ODE for the transform of the unknown function. This is so since the
derivatives of this function with respect to the other variable slip into the transformed
equation. The latter also incorporates the given boundary and initial conditions.
2. Solving that ODE, obtain the transform of the unknown function.
3. Taking the inverse transform, obtain the solution of the given problem.
If the coefficients of the given equation do not depend on t, the use of Laplace transforms
will simplify the problem.
We explain the method in terms of a typical example.
E X AMP LEI
Semi-Infinite String
Find the displacement It'(X. t) of an elastic string subject to the following conditions. (We write
u to denote the unit step function.)
(i)
The suing is initially at rest on the x-axis from x
(ii)
For t > 0 the left end of the string
sine wave
(x = 0)
w(O, t) = f(t) =
= 0 to
00
IV
since we need
("semi-infiniTe string").
is moved in a given fashion, namely, according to a single
sin t
{ 0
if 0
~
t
~
27T
otherwise
(Fig. 313).
SEC. 12.11
595
Solution of PDEs by Laplace Transforms
Fig. 313.
Motion of the left end of the string in Example 1 as a function of time t
(iii) Furthermore,
lim w(x, t) = 0
~
for t
O.
X_Xl
Of course there is no infinite string, but our model describes a long string or rope (of negligible weight) with
its right end fixed far out on the x-axis.
Solutioll.
We have to solve the wave equation (Sec. 12.2)
(1)
p
for positive x and t, subject to the "boundary conditions"
lim w(x. t) = 0
w(O. t) = JU).
(2)
(t ~ 0)
X_!]C
with
J as given above.
and the initial conditions
(3)
"'(x, 0) = 0,
(a)
wt(x, 0) =
(b)
o.
We take the Laplace transform with respect to t. By (2) in Sec. 6.2.
The expression - sw(x. 0) - "'t(x, 0) drops out because of (3). On the right we assume that we may interchange
integration and differentiation. Then
2
;£
2
{a[Ix J= J('°e_ st
2
iJ :, dt
ax
:
o
=
iJ
ax 2
2
(Oe-stw(x, t) dt
Jo
=
iJ
[I~2
;£(w(x, t)},
Writing W(x, s) = ;£{w(x, tl}, we thus obtain
s2W = c
2
a2 w
thus
-2-'
ax
Since this equation contains only a derivative with respect to x, it may be regarded as an ordillary differelltial
equatioll for W(x, s) considered as a function of x. A general solution is
W(x, s)
(4)
=
A(s)esx1c
+ B(s)e-sxlc.
From (2) we obtain. writing F(s) = ;£{.f(tl}.
W(Q. s)
=
~'(1I'W.
t)) = ;£{J(t») = Frs).
Assnming that we can interchange integration and taking the limit, we have
00
lim W(.\", s) = lim
x-x
x.-oo
00
( e -stw(x, t) tit = ( e -st lim w(x. t) tit = O.
Jo
Jo
x_oo
This implies A(s) = 0 in (4) because c > O. so that for every fixed positive s the function eSx!c increases as x
increases. Note that we may assume s > 0 since a Laplace transfonn generally exists for all s greater than some
fixed k (Sec. 6.2). Hence we have
W(O, s)
=
B(s) = F(s),
596
CHAP. 12
Partial Differential Equations (PDEs)
so that (4) becomes
Wtx, s) = F(s)e -sxle.
From the second shifting theorem (Sec. 6.3) with a = xle we obtain the inverse transform
(5)
W(x, t)
=
+- +~)
~)
(Fig. 314)
that is,
W(x,
t)
= sin
(t - ~)
x
x
c
C
- < t < - + 27T
if
ct> x > (t - 27T)C
or
and zero otherwise. This is a single sine wave traveling to the right with speed c. Note that a point x remains
at rest until t = x/c, the time needed to reach that x if one stalts at t = 0 (start of the motion of the left end)
and travels with speed c The result agrees with our physical intuition. Since we proceeded formally, we must
verify that (5) satisfies the given conditions. We leave this to the student.
•
(t=O)LI______________________
x
(t=2ro~~L-~I---------------~
2nc
x
(t = 4n) LI _ _ _~-~LC---"'----.
'C7
x
(t = 6n) LI______________-.--_-,/-/'-
'C7
Fig. 314.
x
Traveling wave in Example 1
This is the end of Chap. 12, in which we concentrated on the most important partial
differential equations (PDEs) in physics and engineering. This is also the end of Part C
on Fourier analysis and PDEs.
We have seen that PDEs have various basic engineering applications, which make them
the subject of many ongoing research projects.
Numerics for PDEs follows in Secs. 21.4-21.7, which are independent of the other
sections in Part E on numerics.
In the next part, Part D on complex analysis, we tum to an area of a different nature
that is also highly important to the engineer, as our examples and problems will show.
This will include another approach to the (two-dimensional) Laplace equation and its
applications in Chap. 18.
f is
"triangular" as in Example 1, Sec. 12.3.
2. How does the speed of the wave in Example I depend
on the tension and on the mass of the string?
3. Verify the solution in Example 1. What traveling wave
do we obtain in Example I in the case of a
1. Sketch a figure similar to Fig. 314 if c = 1 and
(nonterminating) sinusoidal motion of the left end
starting at t = O?
/4-6/
SOLVE BY LAPLACE TRANSFORMS
aw
aw
+ xax
at
4. -
=
x, w(x, 0) = 1, w(O, t) =
Chapter 12 Review Questions and Problems
aw
s.xax
+
aw
at
Applying the convolution theorem. show that
=
Xl,
w(x, 0) = 0 if x
w(O, t)
6.
a2w
ax2
597
a2w
1002
ot
OW
+ 100+
at
o if
t
~
~
0,
0
25w,
w(x, 0) = 0 if x ~ 0, "'t(x, 0) = 0 if t
w(O, t) = sin t if t ~ 0
~
2C~;;
w(x, t) =
0,
L
t
J(t - T)T-3/2e-x2t(4c?T) dT.
9. Let w(O, t) = J(t) = u(t) (Sec. 6.3). Denote the
con-esponding w. W, and F by wo, Wo, and Fo. Show
that then in Prob. 8.
I
2cV;
t
wo(x, t) = __l : _
7. Solve Prob. 5 by another method.
18-101
T-3/2e-x2t(4c?T) dT
0
l-erf(~~)
HEAT PROBLEM
Find the temperature w(x, t) in a semi-infinite laterally
insulated bar extending from x = 0 along the x-axis
to infinity, assuming that the initial temperature is 0,
w(x, t) -> 0 as x -> 00 for every fixed 1 ~ 0, and
w(O. t) = J(t). Proceed as follows.
with the en-or function erf as defined in Problem
Set 12.6.
10. (Duhamel's formula4 ) Show that in Prob. 9,
8. Set up the model and show that the Laplace transform
leads to
and the convolution theorem gives Duhamel's formula
and
W = F(s)e-'- sxte
w(x, 1) =
(F = .'i{f}).
awo
J(t - T) - - dT.
o
OT
I
t
:;:....._= .. =. S T ION SAN D PRO B L EMS
1. Write down the three probably most important PDEs
from memory and state their main applications.
2. What is the method of separating variables for PDEs?
Give an example from memory.
3. What is the superposition principle? Give a typical
application.
4. What role did Fourier series play in this chapter? Fourier
integrals?
S. What are the eigenfunctions and their frequencies of the
vibrating string? Of the heat equation?
6. What additional conditions did we consider for the wave
equation? For the heat equation?
7. Name and explain the three kinds of boundary conditions.
8. What do you know about types of PDEs? About
transformation to normal forms?
9. What is d' Alembert's method? To what PDE does it
apply?
10. When and why did we use polar coordinates? Spherical
coordinates?
11. When and why did Legendre's equation occur in this
chapter? Bessel's equation?
12. What are the eigenfunctions of the circular membrane?
How do their frequencies differ in principle from those
of the eigenfunctions of the vibrating string?
13. Explain mathematically (not physically) why we got
exponential functions in separating the heat equation,
but not for the wave equation.
14. What is the en-or function? Why did it occur and where?
15. Explain the idea of using Laplace transform methods
for PDEs. Give an example from memory.
16. For what k and 111 are X4 + kx 2y2 + y4 and
sin nIX sinh y solutions of Laplace's equation?
17. Verify that (x 2 - y2)/(x 2 + y2)2 satisfies Laplace's
equation.
I
1tH-21
18. U yy
19. "xx
20.
21.
u xy
U yy
Solve for
+
+
+
+
II
=
lI(X,
1611 = 0
- 2u = 0
uy + x + y +
y):
Ux
uy
=
O. u(x, 0)
] = 0
= J(x),
22. Find all solution u(x, y)
equation in two variables.
=
4JEAN-MARlE CONSTANT DUHAMEL (1797-1872), French mathematician.
lIyCX,
0)
=
g(x)
F(x)G(y) of Laplace's
CHAP. 12
598
Partial Differential Equations (PDEs)
li3~261
where
Find and sketch or graph (as in Fig. 285
in Sec. 12.3) the deflection u(x, t) of a vibrating string of
length 7r, extending from x = 0 to x = 7r, and
e 2 = TIp = I. starting with velocity 0 and deflection
4sin 2x
23 . .f(x) = sin x 24 • .f(x)
= !7r -
Ix -
x( 7r -
27 • .f(x) = sin (7rxI50)
28 • .f(x) = x(50 - x)
29 . .f(x) = 25 - 125 - xl
30• .f(x) = 4 sin 3 (7rxIlO)
131-331
Find the temperature IItx, t) in a laterally
insulated bar of length 7r. extending from x = 0 to x = 7r.
with e 2 = I for adiabatic boundary condition (see Problem
Set 12.5) and initial temperature
31. 100 cos 4x
32. 3x 2
21x - ~7r1
34. Using partial sums, graph lI(x, t) in Prob. 33 for several
constant f on conunon axes. Do these graphs agree with
your physical intuition?
35. Let .f(x, y) = utx, y, 0) be the initial temperature in a
thin square plate of side 7r with edges kept at onc and
faces perfectly insulated. Separating variables. obtain
from U t = C 2 V 2 U the solution
x
u(x, y, t) =
L L
Tr
0
7C
.f(x, y) sin mx sin ny dx dy.
0
x)y( 7r -
y).
137-=,~
x)
Find the temperature distribution in a laterally
insulated thin copper bar (e 2 = Klpu = 1.158 cm2 /sec),
50 cm long and of constant cross section with endpoints at
x = 0 and 50 kept at O°C and initial temperature
7r -
ff
.f(x, y) = x( 7r -
127-30 I
33.
4
----:2
7r
!7r1
25 . .f(x) = sin x
=
=
36. Find the temperature in Prob. 35 if
3
26 • .f(x)
Bmn
Bmn sin mx sin
II}"
e-
C2
(m2+n2)t
m=l n=l
Transform to normal form and solve (showing
the details!)
37. u xy = U xx
38.
39.
lIxx
U xx
+
+
411 xy
+
4u yy = 0
411yy
=
0
40. 2u xx + SU xy + 2u yy = 0
41. U xx + 2l1 xy + U yy = 0
42. U yy + u xy - 2u xx = 0
L43-4~1
Show that the following membranes of area
with e 2 = I have the frequencies of the fundamental mode
as given (4-decimal values). Compare.
43. Circle: a/(2V:;;:) = 0.6784
44. Square: Iv'l = 0.7071
45. Rectangle (sides 1: 2): ~ = 0.7906
46. Semicircle: 3.832/vs.;;: = 0.7644
47. Quadrant of circle: aI2/(4v'";) = 0.7244
(a 12 = S.13562 = first positive zero of J2 )
,!8-50 1
Find the electrostatic potential in the following
(charge-free) regions:
48. Between two concentric spheres of radii ro and 1"1 kept
at the potentials Uo and u I , respectively.
49. Between two coaxial circular cylinders of radii 1"0 and
1"1 kept at the potential 110 and lI., respectively.
(Compare with Prob. 48.)
50. In the interior of a sphere of radius 1 kept at the
potential .f(c/J) = cos 3c/J + 3 cos c/J (referred to our
usual spherical coordinates).
1
Partial Differential Equations (PDEs)
Whereas ODEs (Chaps. 1-6) serve as models of problems involving only one
independent variable, problems involving (H'O or more independent variables (space
variables or time t and one or several space variables) lead to PDEs. This accounts for
the enonnous importance of PDEs to the engineer and physicist. Most important are:
(1)
U tt
=
(2)
Utt
= c 2 (u xx + U yy )
('2U=
One-dimensional wave equation (Sees. 12.2-12.4)
Two-dimensional wave equation (Sees. 12.7-12.9)
599
Summary of Chapter 12
One-dimensional heat equation (Secs. 12.5. 12.6)
(4)
(5)
+ U yy = 0 Two-dimensional Laplace equation (Secs. 12.5, 12.9)
= U xx + uYV + U zz = 0 Three-dimensional Laplace equation
V 2 u = u"x
2
V u
(Sec. 12.10).
Equations (I) and (2) are hyperbolic. (3) is parabolic. (4) and (5) are elliptic.
In practice, one is interested in obtaining the solution of such an equation in a
given region satisfying given additional conditions, such as initial conditions
(conditions at time t = 0) or boundary conditions (prescribed values of the solution
u or some of its derivatives on the boundary surface S, or boundary curve C, of the
region) or both. For (1) and (2) one prescribes two initial conditions (initial
displacement and initial velocity). For (3) one prescribes the initial temperature
distribution. For (4) and (5) one prescribes a boundary condition and calls the
resulting problem a (see Sec. 12.5)
Dirichlet problem if u is prescribed on S.
Neumann problem if lin = all/an is prescribed on S,
Mixed problem if u is prescribed on one part of S and
lin
on the other.
A general method for solving such problems is the method of separating
variables or product method, in which one assumes solutions in the form of
products of functions each depending on one variable only. Thus equation (1) is
solved by setting lItx, t) = F(x)G(t); see Sec. 12.3; similarly for (3) (see Sec. 12.5).
Substitution into the given equation yields ordinary differential equations for F and
G, and from these one gets infinitely many solutions F = Fn and G = G n such that
the corresponding functions
are solutions of the PDE satisfying the given boundary conditions. These are the
eigenfunctions of the problem. and the corresponding eigenvalues determine the
frequency of the vibration (or the rapidity of the decrease of temperature in the case
of the heat equation. etc.). To satisfy also the initial condition (or conditions). one
must consider infinite series of the Un. whose coefficients tum oul to be the Fourier
coefficients of the functions f and g representing the given initial conditions
(Secs. 12.3, 12.5). Hence Fourier series (and Fourier integrals) are of basic
importance here (Secs. 12.3. 12.5, 12.6, 12.8).
Steady-state problems are problems in which the solution does not depend on
time I. For these, the heat equation lit = C 2 V 2 U becomes the Laplace equation.
Before solving an initial or boundary value problem. one often transforms the
PDE into coordinates in which the boundary of the region considered is given by
simple formulas. Thus in polar coordinates given by x = r cos e. y = r sin e. the
Laplacian becomes (Sec. 12.9)
(6)
y 2U =
1
li,T
+ -
r
U, .
+
""2 u fi /!;
r
for spherical coordinates see Sec. 12.10. If one now separates the variables. one gets
Bessel's equation from (2) and (6) (vibrating circular membrane, Sec. 12.9) and
Legendre's equation from (5) transf01med into spherical coordinates (Sec. 12.10).
.. ..
'
,
PA R T
D
Complex
Analysis
C HAP T E R 13 Complex Numbers and Functions
C HAP T E R 14 Complex Integration
C HAP T E R 1 5 Power Series, Taylor Series
C HAP T E R 1 6 Laurent Series. Residue Integration
C HAP T E R 17 Conformal Mapping
C HAP T E R 18 Complex Analysis and Potential Theory
Many engineering problems can be modeled, investigated, and solved by functions of a
complex variable. For simpler problems, some acquaintance with complex numbers will
suffice. This is true for simpler electric circuits and mechanical vibrating systems. For
more complicated problems in heat conduction, fluid flow, electrostatics, etc., one needs
the theory of complex analytic functions, briefly called complex analysis. The importance
of the latter in applied mathematics has three main reasons:
1. Most importantly, the real and imaginary parts of an analytic function satisfy
Laplace's equation in two real variables. Hence two-dimensional potential problems can
be solved by methods for analytic functions, and this is often simpler than working in
real.
2. Many complicated real and complex integrals in applications can be evaluated by
the elegant methods of complex integration.
3. Most functions in engineering mathematics are analytic functions, and their study
as functions of a complex variable leads to a deeper understanding of their properties and
to interrelations in complex that have no analog in real calculus.
601
CHAPTER
13
Complex Numbers
and Functions
Complex numbers and their geometric representation in the complex plane are discussed
in Secs. 13.1 and 13.2. Complex analysis i:-. concerned with complex analytic functions
as defined in Sec. 13.3. Checking for analyticity is done by the Cauchy-Riemann
equations (Sec. 13.4). These equations are of basic importance, also because of their
relation to Laplace's equation.
The remaining sections of the chapter are devoted to elementary complex functions
(exponential, trigonometric, hyperbolic, and logarithmic functions). These generalize the
familiar real functions of calculus. Their detailed knowledge is an absolute necessity in
practical work. just as that of their real counterparts is in calculus.
Prerequisite: Elementary calculus.
References and Answers to Problems: App. I Part D. App. 2.
13.1
Complex Numbers. Complex Plane
Equations without real solutions, such as x 2 = -1 or x 2 - lOx + 40 = O. were observed
early in history and led to the introduction of complex numbers.1 By definition, a complex
number z is an ordered pair (x, y) of real numbers x and y, written
z = (x, y).
x is called the real part and y the imaginary part of z. written
x = Re;::,
y
=
[m Z.
By definition, two complex numbers are equal if and only if their real parts are equal
and their imaginary parts are equal.
(0, 1) is called the imaginary unit and is denoted by i,
(1)
i
=
(0, 1).
IFirst to use complex numbers for this purpose was the Italian mathematician GIROLAMO CARDANO
(1501-1576). who found the formula for solving cubic equations. The term "complex number" was introduced
by CARL FRlEDRICH GAUSS (see the footnote in Sec. 5.4). who also paved the way for a general u~e of
complex numbers.
602
SEC. 13.1
Complex Numbers. Complex Plane
603
+ iy
Addition, Multiplication. Notation z = x
Addition of two complex numbers ::1 = (Xl> Yl) and ::2 = (.\"2, ."2) is defined by
(2)
Multiplication is defined by
(3)
In particular, these two definitions imply that
(Xl> 0)
+ (X2'
0) =
(Xl
+ X2,
0)
and
(Xl> 0)(X2' 0)
as for real numbers
can thus write
Xl>
(x,O)
=
(XlX2, 0)
X2. Hence the complex numbers
= x.
"extend" the real numbers We
(0, y)
Similarly,
=
iy
because by (1) and the definition of multiplication we have
iy
=
(0, l)y = (0, l)(y, 0) = (0· y - 1·0,
0·0
+ 1 . y) = (0, y).
Together we have by addition (x, y) = (x, 0) + (0, y) = x + iy:
In practice, complex numbers z = (x, y) are written
(4)
z=
x
+ iy
or z = x + .vi, e.g., 17 + 4i (instead of i4).
Electrical engineers often write j instead if i because they need i for the cunent.
If .r = 0, then z = iy and is called pure imaginary. Also, (1) and (3) give
(5)
because by the definition of multiplication, i 2 = ii = (0, 1)(0, 1)
For addition the standard notation (4) gives [see (2)]
=
(-1, 0)
=
-l.
For multiplication the standard notation gives the following very simple recipe. Multiply
each term by each other term and use i 2 = -1 when it occurs [see (3)]:
This agrees with (3). And it shows that x
numbers than (x, y).
+
iy is a more practical nmation for complex
604
CHAP. 13
Complex Numbers and Functions
If you know vectors. you see that (2) is vector addition. whereas the multiplication (3)
has no counterpart in the usual vector algebra.
E X AMP L E 1
Real Part, Imaginary Part, Sum and Product of Complex Numbers
LeI '::1
= 8 + 3i and '::2 = 9
~
2i, Then Re <:1 = 8. Im:::l = 3, Re:::2 = 9, [m:::2 =
:::1 + :::2 = (8 + 3i) + (9
:::1:::2
= (8 + 3i)(9
~
2;)
~
= 72 + 6 +
2i)
=
;(~16
~2
and
17 + i.
•
+ 27) = 78 + IIi.
Subtraction, Division
Subtraction and division are defined as the inverse operations of addition and
multiplication, respectively. Thus the difference z = ZI - ':2 is the complex number.:: for
which ZI = .: + ':2' Hence by (2),
(6)
The quotient z = z1/22 (Z2 *- 0) is the complex number z for which':l = 2Z2' If we equate
the real and the imaginary parts on both sides of this equation, setting 2 = x + iy, we
obtain Xl = X2X - Y2Y')'I = Y2X + X2)" The solution is
(7*)
z=
21
=x +
X l X2
2
X2
x=
iy.
+ )'1."2
+ ."22
y=
Xl."2
X2Yl X2
2
+ )'22
The practical rule used to get this is by multiplying numerator and denominator of z l /z2
by X2 - iY2 and simplifiying:
(7)
Xl
X2
E X AMP L E 2
+
+
iYl
(Xl
iY2
(X2
+ i)'l) (X2
+ i)'2) (x2
i."2)
-
- i."2)
XI X 2
2
X2
+ ."1)'2 +
+ ."22
i
X2Yl 2
X2
X l .\'2
+ yl
Difference and Quotient of Complex Numbers
For :::1
= 8 + 3; and :2 = 9
ZI
~
2; we get:1
8 + 3i
9 ~ 2;
~ :2
= (8 + 3;)
(8
+ 3i)(9 + 2i)
(9
~
Check the division by multiplication to get 8
2i)(9 + 2;)
+
3i.
~
(9
~
2i)
66 + 43i
81 + 4
=
~1
+ 5; and
66
43
+ -i.
85
85
-
•
Complex numbers satisfy the same commutative. associative. and distributive laws as real
numbers (see the problem set).
Complex Plane
This was algebra. Now comes geometry: the geometrical representation of complex
numbers as points in the plane. This is of great practical importance. The idea is quite
simple and natural. We choose two perpendicular coordinate axes, the horizontal x-axis.
called the real axis, and the vertical y-axis, called the imaginary axis. On both axes we
choose the same unit of length (Fig. 315). This is called a Cartesian coordinate system.
SEC. 13.1
605
Complex Numbers. Complex Plane
y
(Imaginary
axis)
y
5
p
z
=X
x
+iy
(Real
C F - - - ! - - - - - - - - - x - axis)
Fig. 315.
-3
The complex plane
----------
4-3i
Fig. 316. The number 4 - 3; in
the complex plane
We now plot a given complex number z = (x. y) = x + iy as the point P with coordinates
x, y. The xy-plane in which the complex numbers are represented in this way is called the
complex plane. 2 Figure 31 t1 shows an example.
Instead of saying "the point represented by z in the complex plane" we say briefly and
simply "the point z in the complex plane." This will cause no misunderstandings.
Addition and subtraction can now be visualized as illustrated in Figs. 317 and 318.
y
y
x
I
I
I
I
I
6---z2
Fig. 317.
Addition of complex numbers
Fig. 318.
Subtraction of complex numbers
Complex Conjugate Numbers
The complex conjugate z of a complex number z = x
z= x
+
iy is defined by
- iy.
It is obtained geometrically by reflecting the point
this for z = 5 + 2i and its conjugate Z = 5 - 2i.
z in the real axis. Figure 319 shows
Y
2
Fig. 319.
~-
z =x + iy = 5 + 2i
Complex conjugate numbers
2Sometimes called the Argand diagram, atter the French mathematician JEAN ROBERT ARGAND
(1768-1822). born in Geneva and later librarian in Paris. His paper on the complex plane appeared in 1806.
nine years after a similar memoir by the Norwegian mathematician CASPAR WESSEL (1745-1818). a surveyor
of the Danish Academy of Science.
CHAP. 13
606
Complex Numbers and Functions
The complex conjugate is important because it permits us to switch from complex
to real. Indeed, by multiplication,
= x 2 + )'2 (verify!). By addition and subtraction.
z + Z = 2x. z = 2iy. We thus obtain for the real part x and the imaginary part y
(not iy!) of::: = .\ + iy the important formulas
zz
z
(8)
Re
I
I
2 (::: + Z),
z=x=
1m z = y = --;;: (z - z).
_I
If z is real, Z = x, then Z = z by the definition of Z, and conversely.
Working with conjugates is easy, since we have
(9)
E X AMP L E 3
Illustration of (8) and (9)
Let Zl
=4 +
3i and :2
=2+
5i. Then by (8),
3i + 3i
I
1m:1 = 2i [(4 + 3i) - (4 - 3i)] = -2-i- = 3.
Also, the multiplication formula in (9) is verified by
(':1':2) ~ (4
Zl::2
===== --.•. ........
:...
-.-.
to.
+ 3i)(2 + 5i)
= (4 - 3i)(2 -
= (-7
•
"1-=-13. (4z 1
2. (Rotation) \1ultiplication by i is geometrically a
counterclockwise rotation through rr12 (90°). Verify
this by graphing <. and iz and the angle of rotarian for
z = 2 + 2i, : = - I - 5i, z = 4 - 3i.
3. (Dhision) Verify the calculation in (7).
116-]2]
4. (Multiplication) If the product of two complex numbers
is zero, show that at least one factor must be zero.
S. Show that: = x + iy is pure imaginary if and onJy
if;: = -:.
6. (Laws for conjugates) Verify (9) for
':2 = 4 + 6i.
Zl =
24
+
10i.
15.
-
+
(Zl
:2)2
z2)/(zl -
+ iy.
Find:
17. Re (lIZ)
18. 1m [0 + i)8;;:2]
19. Re (1/z2)
20. (Laws of addition and multiplication) Derive the
following laws for complex numbers from the
corresponding laws for real numbers.
+
':2)
+
':3 = ':1
+
(:::2
+
':3)'
(Associative laws)
Let': l = 2 + 3i and Z2 = 4 - 5i. Showing the details
of your work. find (in the form x + iy):
8. ;:1;:2
10. Re (:22), (Re
Z2)
Let z = x
16. Im:3, (1m Z)3
(::1
COMPLEX ARITHMETIC
7. (5':1 + 3::z f
9. Re (1/: 1 2 )
= -7 - 26i.
5;) = -7 - 26i.
1. (powersofi)Showthari 2 = -I, i 3 = -i, i4 = I,
;5 = i .... and IIi = -i. Ili 2 = -I. lIi 3 = i .....
17-151
+ 26i)
(ZlZ2)Z3 =
Zl(Z2Z3)
(Distributive law)
o+ Z =
22)2
Z
+ (- z)
= (- z)
+
Z
+
Z =
0
0,
=
z,
Z'
1
z.
SEC. 13.2
13.2
607
Polar Form of Complex Numbers. Powers and Roots
Polar Form of Complex Numbers.
Powers and Roots
The complex plane becomes even more useful and gives further insight into the arithmetic
operations for complex numbers if besides the xy-coordinates we also employ the usual
polar coordinates r. e defined by
(1)
x
We see that then:::
= r cos
e,
y= r sin e.
= x + iy takes the so-called polar form
: : = r(cos e +
(2)
i sin 8).
r is called the absolute value or modulus of.: and is denoted by
1::1
(3)
= r = V.~ + );2 =
Izl. Hence
V2 .
Geometrically, Izl is the distance of the point z from the origin (Fig. 320). Similarly,
1'::1 - :::21 is the distance between Zl and 22 (Fig. 321).
e is called the argument of z and is denoted by arg z. Thus (Fig. 320)
e=
(4)
y
arg::: = arctan .:....
(z
X
*" 0).
Geometrically, e is the directed angle from the positive x-axis to OP in Fig. 320. Here. as
in calculus, all angles are measured in radians and positive in the counterclockwise sense.
For z = 0 this angle e is undefined. (Why?) For a given z
0 it is determined only
up to integer multiples of 27r since cosine and sine are periodic with period 27r. But one
often wants to specify a unique value of arg ::: of a given::: O. For this reason one defines
the principal value Arg::: (with capital A!) of arg ::: by the double inequality
*"
*"
-7r < Arg
(5)
z~
7r.
Then we have Arg z = 0 for positive real.:: = x, which is practical, and Arg z = 7r (not
-7r!) for negative real :::, e.g., for z = -4. The principal value (5) will be important in
connection with roots, the complex logarithm (Sec. 13.7), and certain integrals. Obviously,
for a given z 0 the other values of arg ::: are arg::: = Arg::: ± 21l7r (11 = ± I. ±2... ').
*"
Imaginary
axis
Y
p
Y -----------. z
.
=X + 'Y
I
I
I
06"'---'---------!~c--- :~~I
Fig. 320. Complex plane, polar form
of a complex number
x
Fig. 321. Distance between two
points in the complex plane
CHAP. 13
608
E X AMP L E 1
y
Complex Numbers and Functions
Polar Form of Complex Numbers. Principal Value Arg z
z= I
+ ; (Fig. 322) has the polar form z = V2 (COS!7T + i sin !7T). Hence we obtain
arg::: =!7T:!:
1+i
Similarly.
z = 3 + 3V3i
=
21l7T(1l
=
6 (cos ~7T
D, I."
.),
+ i sin ~7T).
and
Izl
=
Arg::: =!7T (the principal value).
•
6. and Arg::: = ~7T.
lfl4
CArTION! [n using (4), we must pay attention to the quadrant in which::: lies, since
tan 6 has period 7r, so that the arguments of z and -z have the same tangent. Example:
Example 1 for 6 1 = arg (1 + i) and 62 = arg (-] - i) we have tan 61 = tan 62 = 1.
x
g. 322.
Triangle Inequality
Inequalities such as Xl < X2 make sense for real numbers, but not in complex because
there is 110 lIatural WllY of ordering complex 11 umbers. However, inequalities between
absolute values (which are real!), such as IZII < 1:: 21 (meaning that ZI is closer to the origin
than Z2) are of great importance. The daily bread of the complex analyst is the triangle
inequality
(6)
(Fig. 323)
which we shall use quite frequently. This inequality follows by noting that the three points
0, .(;1' and;::1 + ':2 are the vertices of a triangle (Fig. 323) with sides 1z.1, 1.:21, and 1;::1 + 221.
and one side cannot exceed the sum of the other two sides. A formal proof is left to the
reader (Prob. 35). (The triangle degenerates if:::l and :::2 lie on the same straight line through
the origin.)
Y
I -,P"
~"."
~
x
Fig. 323. Triangle inequality
By induction we obtain from (6) the generalized triangle inequality
7 +
1-1
(6*)
~
~2
+ ... + -n
7 I :so;
-
that is. the absolute value of a
terms.
A
.. 2
SUIIl
Iz1 I +
17-2 I + ... +
Izn I',
callnot exceed the sum of the absolute vailies of The
Triangle Inequality
If:::l = I
+ ; and :::2
=
-2
1:::1 +
+ 3;. then (sketch a figure!)
:::21
=
I-I
+ 4il
=
\'17 =
4.123 <
\'2 +
Vi]
=
5.020.
Multiplication and Division in Polar Form
This will give us a "geometrical"' understanding of multiplication and division. Let
and
•
SEC. 13.2
609
Polar Form of Complex Numbers. Powers and Roots
Multiplication.
By (3) in Sec. 13.1 the product is at first
The addition rules for the sine and cosine [(6) in App. A3.1] now yield
(7)
Taking absolute values on both sides of (7), we see that the absolute value of a product
equals the product of the absolute values of the factors,
(8)
Taking arguments in (7) shows that the argument of a product equals the sum of the
arguments of the factors,
(9)
(up to multiples of 27T).
Division.
by 1'<:21
We have ~l
=
Hence
(ZlIz2)z2.
IZ11 = I(zI 1z2)z21 = IZ11z211z21
and by division
(10)
Zl
arg -
(11)
= arg Z1
-
arg
(up to multiples of 27T).
Z2
22
Combining (10) and
(II)
we also have the analog of (7),
(12)
To comprehend this formula. note that it is the polar form of a complex number of absolute
value r1/r2 and argument (it - 82 . But these are the absolute value and argument of zl lz2 ,
as we can see from (10). (II), and the polar forms of Zl and Z2.
E X AMP L E 3
Illustration of Formulas (8)-(11)
Let Zl
= -2 + 2; and::2 = 3i. Then
~IZ2
= -6 - 6i, zl fz2 = 213 + (213);. Hence (make a sketch)
and for the argumems we obtain Arg::1 = 3m4, Arg;:2 = 7[12,
37[
Arg (::1::2)
= - 4 = Arg;:1 +
Arg::2 - 27[,
Arg (::/<:2) =
;
= Arg Z1
-
Arg ;:2·
•
610
E X AMP L E 4
CHAP. 13
Complex Numbers and Functions
Integer Powers of z. De Moivre's Formula
From (8) and (9) with
~1 = ~2 = Z
we obtain by induction for
(13)
Z,n
Similarly. (l~) with:: 1 = I and::2
De Moivre's formula3
rn (COS
= : " gives
(cos
(13*)
=
e+ i
ne + i
O. 1,2....
11 =
sin I/e).
(I3)for 11 = - I, -2..... For 1::1 = r = I, tormula (13) becomes
sin
e)'" =
cos
I/e +
i sin nfl.
We can use this to express cos 118 and sin lI8 in terms of powers of cos 8 and sin 8. For instance, for II = 2 we
have on the left cos2 0 + 2; cos 0 sin 0 - sin2 O. Taking the real and imaginary parts on both sides of (13"')
with /I = 2 gives the familiar formulas
cos 28 = cos2 0 - sin 2 8,
sin 20 = 2 cos 0 sin O.
This shows that complex methods often simplify the derivation of real formulas. Try
/I
=
3.
•
Roots
If ;: = w" (n = 1. 2, .. '). then to each value of w there corresponds olle value of ;:. We
shall immediately see that, conversely, to a given z =1= 0 there correspond precisely 11
distinct values of w. Each of these values is called an nth root of ;:, and we write
(14)
=
W
~nf
V
z.
Hence this symbol is l1lultivalued, namely, n-va/ued. The 11 values of ~ can be obtained
as follows. We write;: and w in polar form
z=
r(cos
Then the equation w"
e + i sin tJ)
and
= z becomes.
w" = R"(cos Ilc/J
w
=
R(cos
c/J + i sin c/J).
by De Moivre's formula (with
+i
sin
11c/J) = .: =
r(cos
e+
c/J instead of e)
i sin
e).
The absolute values on both sides must be equal: thus. R n = r. so that R = Vr , where
~"f
v r is positive real (an absolute value must be nonnegative!) and thus uniquely determined.
Equating the arguments 11c/J and e and recalling that e is determined only up to integer
multiples of 21T, we obtain
11c/J =
e + 2k1T,
thus
c/J=
e
11
+
2k1T
n
where k is an integer. For k = O. I, .... n - I we get 11 distinct values of w. Further
integers of k would give values already obtained. For instance, k = n gives 2k7r1n = 271",
3 ABRAHAM DE MOIVRE (1667-1754), French mathematician. who pioneered the use of complex numbers
in trigonometry and also contributed to probability theory (see Sec. 24.8).
SEC. 13.2
611
Polar Form of Complex Numbers. Powers and Roots
hence the w corresponding to k
values
= 0, etc. Consequently,
~nl
~nl (
v z = v r cos
(15)
Vz, for z *- 0, has the 11 distinct
+ 2k7r + 1.Sin
' () + 2k7r)
----
()
11
11
~nl
where k = 0, I, ... , 11 - 1. These n values lie on a circle of radius v r with center at
the origin and constitute the vertices of a regular polygon of 11 sides. The value of Vz
obtained by taking the principal value of arg z and k = 0 in (15) is called the principal
~"I
value of w = v .:: .
Taking.:: = I in (15), we have
~nr.
v 1
(16)
Izl
=
r
= I and Arg::: = O. Then
2br
n
2br
n
= cos - - + i sin - - ,
(15) gives
k = 0, 1, .... n - 1.
These 11 values are called the nth roots of unity. They lie on the circle of radius I and
center 0, briefly called the unit circle (and used quite frequently!). Figures 324-326 show
~3r.1
....V11 -- +- I • +'
v I -- I , _12 -+ 1~
2 V r;;3
.:j I . ,
_I, an d ~5r.l
VI.
If w denotes the value corresponding to k
written as
= I in (6). then the 11 value" of VI can be
More generally, if WI is any nth root of an arbitrary complex number
II values of Vz in (15) are
z (*- 0), then the
(17)
because multiplying ~~'! by w k corresponds to increasing the argument of WI by 2k7r/n.
Formula (17) motivates the introduction of roots of unity and shows their usefulness.
y
y
y
OJ
OJ
OJ
x
Fig. 324.
11-81
Vl
POLAR FORM
Do these problems very carefully since polar forms will be
needed frequently. Represent in polar form and graph in
the complex plane as in Fig. 322 on p. 608. (Show the
details of your work.)
x
:r
Fig. 325.
~l
Fig. 326.
L 3 - 3i
2. 2i. -2;
3. -5
4. ~
5.
1 +
1 - ;
6.
+
"\ll
~1Ti
3V2 + 2i
-VI - (2/3);
CHAP. 13
612
7.
+ 5;
-6
+
+
2
8. 5
3i
[9-151
Complex Numbers and Functions
where sign y = I if y ~ 0, sign y = - I if y < 0,
and all square roots of positive numbers are taken
with positive sign. Hint: Use (10) in App. A3.1 with
x = 012.
3;
4i
PRINCIPAL ARGUMENT
(e) Find the square roots of 4;, 16 - 30i, and
+ 8 v7 i by both (18) and (19) and comment on the
work involved.
Determine the principal value of the argument.
9. - I - i
10. - 20 + ;, - 20 - ;
12. -7T 2
14. (l +
11. 4 ::':: 3;
13. 7 ::':: 7;
IS. (9 + 9;)3
9
(d) Do some further examples of your own and apply
a method of checking your results.
i)12
127-301
116-20 I
EQUATIONS
CONVERSION TO X + iy
Represent in the form x + iy and graph it in the complex
Solve and graph all solutions, showing the details:
27. ::2 - (8 - 5i)::; + 40 - 20; = 0 (Use (19).)
plane.
28. ::4
+
29.
30.
+
16. COS!7T + ; sin (::'::!7T)
18. 4(COS!7T ::':: ; sin !7T)
20. 12(cos ~7T + ; sin ~7T)
121-251
17. 3(cos 0.2 + ; sin 0.2)
19. cos (-I) + ; sin (-I)
ROOTS
23. ~
2S.~
24. ~ 3 + 4;
26. TEAM PROJECT. Square Root. (a) Show that
w = ~ has the values
=
Vi-
(36 - 6i)z
+
42 -
+
Wi) = 0
I Ii
=
0
16 = O. Then use the solutions to factor Z4
into quadratic factors with real coefficients.
Z4
+
16
31. CAS PROJECT. Roots of Unity and Their Graphs.
Find and graph all roots in the complex plane.
21. V-i
22. {Y]
}\'1
8::;2 -
(5 - 14i)::2 - (24
[cos
~
Write a program for calculating these roots and for
graphing them as poims on the unit circle. Apply the
program to z n = 1 with n = 2, 3. . . . , 10. Then extend
the program to one for arbitrary roots. using an idea
near the end of the text, and apply the program to
examples of your choice.
132-351
+ ; sin
~]
INEQUALITIES AND AN EQUATION
Verify or prove as indicated.
'
32. (Re and 1m) Prove IRe zl ~ Izl, lIm zl ~ Izl·
33. (parallelogram equality) Prove
1::1 + 2212 + 1.:::1 - ::;21 2 = 2(h1 2 + IZ212).
Explain the name.
(b) Obtain from (8) the often more practical formula
(19)
V~
13.3
=
::,::[v'~ (1.;:1
+x) +
(signy)iv'~ (izl +x)j
34. (Triangle inequality) Verify (6) for
::2 = 5 + 1;.
35. (Triangle inequality) Prove (6).
ZI =
4
+
7i.
Derivative. Analytic Function
Our study of complex functions will involve point sets in the complex plane. Most
important will be the following ones.
Circles and Disks. Half-Planes
The unit circle Izl = 1 (Fig. 327) has already occurred in Sec. 13.2. Figure 328 shows a
general circle of radius p and center a. Its equation is
Iz -
al =
p
SEC. 13.3
613
Derivative. Analytic Function
y
y
y
1 x
Fig. 127.
Unit circle
a
x
x
Fig. 128. Circle in the
complex plane
Fig. 129. Annulus in the
complex plane
because it is the set of all : whose distance Iz - al from the center 1I equals p. Accordingly,
its interior ("open circular disk") is given by Iz - al < p, its interior plus the circle itself
("closed circular disk") by Iz - al ~ p, and its exterior by Iz - al > p. As an example,
sketch this for a = 1 + i and P = 2, to make sure that you understand these inequalities.
An open circular disk Iz - 1I1 < P is also called a neighborhood of aor, more precisely,
a p-neighborhood of 1I. And 1I has infinitely many of them. one fur each value of
P (> 0), and a is a point of each of them, by definition!
In modem literature any set containing a p-neighborhood of a is also called a
neighborhood of a.
Figure 329 shows an open annulus (circular ring) PI < Iz - al < P2, which we shall
need later. This is the set of all z whose distance Iz - al from 1I is greater than PI but less
than P2. Similarly, the closed annulus PI ~ Iz - al ~ P2 includes the two circles.
Half-Planes. By the (open) upper half-plane we mean the set of all points: = x + iy
such that y > O. Similarly, the condition y < 0 defines the lower half-plane, x > 0 the
right half-plane, and x < 0 the left half-plane.
For Reference: Concepts on Sets in the
Complex Plane
To Our discussion of special sets let us add some general concepts related to sets that we
shall need throughout Chaps. 13-18: keep in mind that you can find them here.
By a point set in the complex plane we mean any sort of collection of finitely many
or infinitely many points. Examples are the solutions of a quadratic equation, the points
of a line, the points in the interior of a circle as well as the sets discussed just before.
A set S is called open if every point of S has a neighborhood consisting entirely of
points that belong to S. For example, the points in the interior of a circle or a square form
an open set, and so do the points of the right half-plane Re z = x > O.
A set S is called connected if any two of its points can be joined by a broken line of
finitely many straight-line segments all of whose points belong to S. An open and connected
set is called a domain. Thus an open disk and an open annulus are domains. An open
square with a diagonal removed is not a domain since this set is not connected. (Why?)
The complement of a set S in the complex plane is the set of all points of the complex
plane that do 1I0t belo1lg to S. A set S is called closed if its complement is open. For
example, the points on and inside the unit circle form a closed set ("closed unit disk")
since its complement Izl > I is open.
A boundary point of a set S is a point every neighborhood of which contains both
points that belong to S and points that do not belong to S. For example, the boundary
614
CHAP. 13
Complex Numbers and Functions
points of an annulus are the points on the two bounding circles. Clearly, if a set S is open.
then no boundary point belongs to S; if S is closed, then every boundary point belongs to
S. The set of all boundary points of a set S is called the boundary of S.
A region is a set consisting of a domain plus, perhaps, some or all of it'> boundary
points. WARNING! "Domain" is the modem term for an open connected set.
Nevertheless, some authors still call a domain a "region" and others make no distinction
between the two terms.
Complex Function
Complex analysis is concerned with complex functions that are differentiable in some
domain. Hence we should first say what we mean by a complex function and then define
the concepts of limit and derivative in complex. This discussion will be similar to that in
calculus. Nevertheless it needs great attention because it will show interesting basic
differences between real and complex calculus.
Recall from calculus that a real function f defined on a set S of real numbers (usually
an interval) is a rule that assigns to every x in S a real number f(x), called the value of
f at x. Now in complex, S is a set of complex numbers. And a function f defined on S is
a rule that assigns to every.::: in S a complex number lV, called the vallie of fat.:::. We write
w
=
f(.:::).
Here z varies in S and is called a complex variable. The set S is called the domain of
definition of f or, briefly, the domain of f. (In most cases S will be open and connected,
thus a domain as defined just before.)
Example: w = fez) = Z2 + 3.::: is a complex function defined for all z; that is, its domain
S is the whole complex plane.
The set of all values of a function f is called the range qf f.
w is complex, and we write w = u + iv, where u and v are the real and imaginary
parts, respectively. Now H' depends on .::: = x + iy. Hence u becomes a real function of x
and y. and so does v. We may thus write
w = fez) = u(x, y)
+ iv(x, y).
This shows thaI a complex function f(z) is equivalent to a pair of real functions u(x, v)
and vex, y), each depending on the two real variables x and y.
E X AMP L E 1
Function of a Complex Variable
Let w = 1(:) = ;;:2
Solutio1l.
/I
+ 3::. Find II and v and calculate the value of I at :: = I + 3i.
= Re 1(:::) = x 2
1(1 + 3i) =
.\"2
-
(I
+ 3~ and v = 2~y +
3y. Also.
+ 3i)2 + 30 + 3i) = I - 9 + 6i + 3 + 9i = - 5 +
15i.
This shows that 11(1. 3) = -5 and vO. 3) = 15. Check this by using the expressions for II and v.
E X AMP L E 2
•
Function of a Complex Variable
Let w = f(:;:)
= 2iz + 6z. Find u and v and the vallie of f
Solution.
1(;::)
=
2i(x
+
iy)
I(! + 4i) =
Check thIS
a~ III
Example I.
+
6(x - iy) gives Lt(x. y)
2i(~
at z
=
= ~ + 4i.
6x - 2)" and vex. y)
= 2.< - 6.\". Also,
+ 4i) + 6(! - 4i) = i - 8 + 3 - 24i = -5 - 23;.
•
SEC 13.3
615
Derivative. Analytic Function
Remarks on Notation and Terminology
1. Strictly speaking, fez) denotes the value of f at z, but it is a convenient abuse of
language to talk about the junction fez) (instead of the junction f), thereby exhibiting the
notation for the independent variable.
2. We assume all functions to be sillgle-valued relatiolls, as usual: to each.: in S there
corresponds but one value w = f(.:) (but. of course, several z. may give the same value
tv = fez), just as in calculus). Accordingly, we shall not lise the term "multi valued
function" (used in some books on complex analysis) for a multivalued relation. in which
to a.: there corresponds more than one w.
Limit, Continuity
A function f(;:.) is said to have the limit I as ;:. approaches a point
lim fez)
(1)
':0,
written
I,
=
z-Z'o
if f is defined in a neighborhood of ':0 (except perhaps at Zo itself) and if the values
of f are "close" to I for all z. "close" to Zo; in precise terms, if for every positive real E
we can find a positive real 0 such that for all z ':0 in the disk Iz - 201 < 0 (Fig. 330)
we have
*
If(z) -
(2)
II
<
E;
*
geometrically. if for every.:::
':0 in that 8-disk the value of f lies in the disk (2).
Formally, this definition is similar to that in calculus. but there is a big difference.
Whereas in the real case, x can approach an Xo only along the real line. here, by definition.
z may approach Zofrolll allY direction in the complex plane. This will be quite es~ential
in what follows.
If a limit exists, it is unique. (See Team Project 26.)
A function fez) is said to be continuous at
(3)
lim f(.:)
z=
=
':0
if f(.:o) is defined and
f(;:.o)·
Z-Zo
Note that by definition of a limit this implies that fez) is defined in some neighborhood
of ':0'
f(;:.) is said to be continuous in a domain if it is continuous at each point of this domain.
v
y
,,.---- ........,
" ---- "
--.1_
I
,
E~l
I
,.....------
,
--"-0
,
I
f(z)
,
I
,
x
I
,
Fig.
:no.
Limit
.... ' .... _---,,. "
"
U
616
CHAP. 13
Complex Numbers and Functions
Derivative
The derivative of a complex function
f at a point ':0 is written J' (~o) and is defined by
(4)
provided this limit exists. Then f is said to be differentiable at zoo If we write b.z
we have z = '::0 + .1.: and (4) takes the fonn
f' (zo)
(4')
':0,
fez) - f(zo)
lim
=
= :: -
Z -
2-20
20
Now comes an important point. Remember that, by the definition of limit. f(.::) is defined
in a neighborhood of Zo and z in (4') may approach Zo from any direction in the complex
plane Hence differentiability at '::0 means that. along whatever path.:: approaches ':0' the
quotient in (4') always approaches a certain value and all these values are equal. This is
important and should be kept in mind.
E X AMP L E 1
Differentiability. Derivative
The function
I(;:;)
=
~2 is differentiable for all.: and has the derivative
I'(.:)
= 2.: because
•
The differentiation rules are the same as in real calculus, since their proofs are literally
the same. Thus for any analytic functions f and g and constants c we have
(cf)'
= cJ',
(f
+ g)' = J' + g',
(fg)'
=
f'g
+
fg',
J'g - fg'
(;)' =
If
as well as the chain rule and the power rule (:::n)' = 11Zn - 1 (11 integer).
Also, if f(.::) is differentiable at zoo it is continuous at '::0' (See Team Project 26.)
E X AMP L E 4
i not Differentiable
It may come as a surprise that there are many complex functions that do not have a derivative at any point. For
instance. II.:) = ;: = f - iy is such a function. To ~ee this. we write .l:: = .l" + ;.ly and obtain
I(~
(5)
+ .l::)
- I(::)
(z
+ j.::)
Cl.x - iCl.y
- ;:
Cl.7
Cl.::
Cl.x
+ iCl.y
If .ly = O. thi, i, + I. If j.x = O. this is - I. Thu, (5) approaches + I along path I in Fig. 331 but -I along
path H. Hence. by definition. the limit of (5) as .l: -> 0 does not exist at any.:.
•
y
x
Fig. 331.
Paths in (5)
SEC. 13.3
617
Derivative. Analytic Function
Surprising as Example 4 may be. it merely illustrates that differentiability of a compler
function is a rather severe requirement.
The idea of proof (approach of z. from different directions) is basic and will be used
again as the crucial argument in the next section.
Analytic Functions
Complex analysis is concerned with the theory and application of "analytic functions,"
that is. functions that are differentiable in some domain. so that we can do "calculus in
complex." The definition is as follows.
DEFINITION
Analyticity
A function f(::.) is said to be allalytic ill a domaill D if f(~) is defined and
differentiable at all points of D. The function f(z.) is said to be analytic at a point
Z. = Zo in D if fez) is analytic in a neighborhood of zoo
Also, by an analytic function we mean a function that is analytic in some domain.
Hence analyticity of fez) at :0 means that fez) has a derivative at every point in some
neighborhood of Zo (including Zo itself since, by definition, Zo is a point of all its
neighborhoods). This concept is motivated by the fact that it is of no practical interest if
a function is differentiable merely at a single point ::'0 but not throughout some
neighborhood of zoo Team Project 26 gives an example.
A more modem term for analytic in D is bolomorphic in D.
E X AMP L E 5
Polynomials, Rational Functions
The nonnegative integer powers I,
that is, functions of the form
z, ::.2•••• are analytic
where Co• • • • • C n are complex constants.
The quotient of two polynomials g(::.) and
in the entire complex plane. and so are polynomials,
h(;;:),
g(::.)
I(:) =
he:) ,
is called a rational function. This I is analytic except at the points where /i(::;) = 0: here we assume that common
factors of .Ii and h have been canceled.
Many further analytic functions will be considered in the next sections and Chapters.
•
The concepts discussed in this section extend familiar concepts of calculus. Most important
is the concept of an analytic function, the exclusive concern of complex analysis. Although
many simple functions are not analytic, the large variety of remaining functions will yield
a most beautiful branch of mathematics that is very useful in engineering and physics.
11-101
CURVES AND REGIONS OF
PRACTICAL INTEREST
3. 0 <
Iz -
3 - 2il = ~
2. 1 ~
Iz -
< 1
5. 1m Z2 = 2
Find and sketch or graph the sets in the complex plane given
by
1.
Iz - 11
I
+
4il ~ 5
7.
Iz + 11
=
4. -7r<Re;:<7r
6.Rez>-I
Iz - 11
9. Re z 21m.::
8. IArg
zl
10. Re (1/:)
~ ~7r
< 1
618
CHAP. 13
Complex Numbers and Functions
11. WRITING PROJECT. Sets in the Complex Plane.
Extend the part of the text on sets in the complex plane
by fonnulating that part in your own words and
including examples of your own and comparing with
calculus when applicable.
25. CAS PROJECT. Graphing Functions. Find and
graph Re f. 1m f. and IfI as surfaces over the ::-plane.
Also graph the two families of curves Re Ie::) = COllSt
and 1m if:::) = COllst in the same figure, and the curves
If(zli = COIlS! in anoth€r figure, where (a) fez) = ::2,
(b) I(z) = liz, (c) fez) = Z4.
COMPLEX FUNCTIONS AND DERIVATIVES
26. TEAM PROJECT. Limit, Continuity, Derivative
(a) Limit. Prove that (I) is equivalent to the pair of
relations
112-151 Function Values. Find Re I and 1m f. Also find
their values at the given point :::.
f = 3::: 2 - 6::: + 3i, z = 2 +
f
.:::/(z + I), z = 4 - 5i
14. f
1/( I - :::), ::: = l + !i
15. f
1/:::2, ::: = I + ;
12.
13.
lim Re i(z) = Re t,
(b) Limit. If lim I(:::) exists, show that this limit is
z-zo
unique.
(e) Continuity. If:::}o ::2' ... are complex numbers for
which lim ::" = a, and if i(:) is continuous at
116-191 Continuity. Find out (and give reason) whether
.f(z) is continuous at ::: = 0 if I(O) = 0 and for z =1= 0 the
function I is equal to:
17. [1m (::2)]/1z1
16. [Re (::2)]/ld 2
19. (1m ::)/(1
18. 1z12 Re (1/::)
1:::1)
120-241 Derivative. Differentiate
20. (.:::2 - 9)/(:::2 + I)
21. (:3
22. (3:: + 4i)/( 1.5;: - 2)
24. ::2/(: + ;)2
13.4
lim 1m Ie::) = 1m l.
Z-Zo
2-----;"2'0
'it_CO
z = a, show that lim i(::n) = i(a).
n-----'""x
(d) Continuity. If if:::) is differentiable at :::0' show that
if:::) is continuous at :::0'
(e) Differentiability. Show that if::) = Re z = x is
not differentiable at any z. Can you find other such
functions?
+ ;)2
23. i/(l - ;::)2
Differentiability. Show that if::) = 1:::12 is
differentiable only at:: = 0; hence it is nowhere analytic.
(l)
Cauchy-Riemann Equations.
Laplace's Equation
Tlte Cauchy-Riemall1l equatiolls are tile most importallt equatiolls ill tltis chapter and
one of the pillars on which complex analysis rests. They provide a criterion (a test) for
the analyticity of a complex function
w
=
fez)
=
u(x, y)
+
iv(x, y).
Roughly, f is analytic in a domain D if and only if the first partial derivatives of u and
v satisfy the two Cauchy-Riemann equations4
(1)
4 The French mathematician AUGUSTIN-LOUIS CAUCHY (see Sec. 2.5) and the German mathematicians
BERNHARD RIEMANN (l1l26-Hl66) and KARL WEIERSTRASS (1815 ·1897: see also Sec. 15.5) are the
founders of complex analysis. Riemann received his Ph.D. (in 1851) under Gauss (Sec. 5.4) at Gilttingen. where
he also taught until he died, when he was only 39 years old. He introduced the concept of the integral as it is
used in basic calculus courses. and made important contributions to differential equations. number theory. and
mathematical physics. He also developed the s(}-called Riemannian geometry. which is the mathematical
foundation of Einstein's theory of relativity; see Ref. [GR9] in App. I.
SEC. 13.4
Cauchy-Riemann Equations. Laplace's Equation
619
everywhere in D; here U x = alliax and u y = aulay (and ~imilarly for v) are the usual
notations for partial derivatives. The precise formulation of this statement is given in
Theorems I and 2.
Example: fez) = ;:,2 = x 2 - ."2 + 2ixy is analytic for all:: (see Example 3 in Sec. 13.3),
and II = x 2 - ."2 and v = 2xy satisfy (1), namely, U x = 2x = Vy as well as lIy = -2y = -v x .
More examples will follow.
Cauchy-Riemann Equations
THEOREM 1
Let fez) = lI(X, y) + iv(x, y) be defined and continuous in some neighborhood of a
point :: = x + iy and d(fferentiable at :: itself. Then at that point, the first-order
partial derimtil'es of u and v exist and satisfy the Cauchy-Riemann equations (I).
Hence if ft::) is analytic ill a domain D, those partial deriI'Gtil'es exist and satisfr
(l) at all points of D.
PROOF
By a~~umption. the derivative
f' (.:) at .: exists. It is given by
f' (z)
(2)
=
lim
fez
+
!>z~O
ilz) - fez)
ilz
The idea of the proof is very simple. By the definition of a limit in complex (Sec. 13.3)
we can let S~: approach zero along any path in a neighborhood of ;:.. Thus we may choose
the two paths I and II in Fig. 332 and equate the results. By comparing the real parts we
shall obtain the fir.;t Cauchy-Riemann equation and by comparing the imaginary parts the
second. The technical details are as follows.
We write ..k = ~x + i:1y. Then.: + .1.: = x + :1x + iCy + :1.\"), and in terms of /I and
v the derivative in (2) becomes
(3)
f' (;:.) = lim
[lI(x
+
ilx, y
+
ily)
+
iv(x
..lz~O
+
ilx, )'
.!lx
+
+
ily)] - [II(X, .1')
+
iv(x, y)]
i.!ly
We first choose path I in Fig. 332. Thus we let ily ~ 0 first and then ilx ~ O. After ily
is zero, il:: = ilx. Then (3) becomes. if we first write the two u-tenns and then the two
v-terms,
f'(.:) = lim
..lx~O
lI(X
+
.!lx, .r) - lI(X, .r)
.1.\
+i
lim
.l.x~O
y
x
Fig. 332.
Paths in (:2)
vex
+
.lx, r) - vex, r)
.
.
6..\
620
CHAP. 13
Complex Numbers and Functions
Since f' (z) exists, the two real limits on the right exist. By definition, they are the partial
derivatives of u and v with respect to x. Hence the derivative f' (z) of fez) can be written
(4)
Similarly, if we choose path II in Fig. 332. we let ~x ~ 0 first and then
is zero, ~:: = i:1y, so that from (3) we now obtain
~y ~
O. After
~x
f' (::) = lim
..ly~O
II(X, \'
.
+ .1,') ..
I .1y
u(x, ,.)
-
+
i lim
.ly~O
vex, "
+ .1 \') -
_.
i.ly
vex, ")
-
Since f' (.:) exists, the limits on the right exist and give the partial derivatives of u and v
with respect to y; noting that 1Ii = -i, we thus obtain
j'(z) = -illy
(5)
+
Vy .
The existence of the derivative f' (z) thus implies the existence of the four partial
derivatives in (4) and (5). By equating the real parts liT and Vy in (4) and (5) we obtain
the first Cauchy-Riemann equation (1). Equating the imaginary parts gives the other. This
proves the first statement of the theorem and implies the second because of the definition
of analyticity.
•
FOlmulas (4) and (5) are also quite practical for calculating derivatives
see.
E X AMP L E 1
f' (z), as we shall
Cauchy-Riemann Equations
J(~) = ::2 is analytic for all ~. It follow, that the Cauchy-Riemann equation, mu,t be ,atisfied (as we have
verified abuve).
For f(::) = :: = x - iy we have /I = X, V = -.1' and see that the second Cauchy-Riemann equation is satisfied.
/l y = -v x = O. but the tlrst is not: "x = I
Vy = -1. We conclude that f(::) = :: is not analytic. confirming
Example 4 of Sec. 13.3. Note the savings in calculation!
•
*
The Cauchy-Riemann equations are fundamental because they are not only necessary
but also sufficient for a function to be analytic. More precisely, the following theorem
holds.
THEOREM 2
Cauchy-Riemann Equations
If two real-valued continllolls functions lI(X. y) and vex. y) of two real variables x
and y have COlltillUOUS first partial derivatives that satisfy the Cauchy-Riemll1ln
equlItions in some domain D, then the complex jilllctioll fez) = lI(X, y) + iv(x, y) is
allalytic ill D.
The proof is more involved than that of Theorem 1 and we leave it optionallsee App. 4).
Theorems I and 2 are of great practical importance, since by using the
Cauchy-Riemann equations we can now easily find out whether or not a given complex
function is analytic.
SEC. 13.4
Cauchy-Riemann Equations. Laplace's Equation
E X AMP L E 2
621
Cauchy-Riemann Equations. Exponential Function
Is i(:::)
= II(X. y) +
Solution.
iv(x, y)
= eX(cos y + i
sin y) analytic?
We have II = eX cos y, v = eX sin
ltx = eX
lIy
=
y and by differentiation
vy
cosy.
sin y.
-ex
= eX
Vx =
e
x
cos .\"
.
smy.
We see that the Cauchy-Riemann equations are satisfied and conclude that
be the complex analog of eX known from calculus.)
I(~)
is analytic for all
~. (f(~)
will
•
E X AMP L ElAn Analytic Function of Constant Absolute Value Is Constant
The Cauchy-Riemann equations also help in deriving geneml properties of analytic functions.
For instance. show that if I(~) is analytic in a domain D and II(::) I = k = CO/1St in D. then I(:) =
D. (We shall make crucial use of thb in Sec. 18.6 in the proof of Theorem 3.1
Solutioll.
Now use
By assumption.
Vx = -lly
IJI2 = lu + ivl2
in the first equation and
= I?
+ v2
in
2
= k . By differentiation,
IIllX
+ vVx
=
lllly
+ VVy
=
Vy = llx
COIlst
o.
o.
in the second. to gel
(a)
llllx -
(b)
lilly
Vlly =
0,
(6)
To get rid of lly. multiply (6a) by
(6b I by II and add. l1lis yields
II
and (6b) by
+ Vllx
= O.
v
and add. Similarly. to eliminate
2
+
V
2
+
2
V )lIy =
(11
(11
2
llx.
multiply (6a) by
-v
and
)lIx = O.
O.
*
If k 2 = ll2 + v 2 = O. then II = v = 0; hence I = O. If k 2 = ll2 + v 2
O. then IIx = lIy = O. Hence. by
the Cauchy-Riemann equations. also Vx = Vy = O. Together this implies II = COllst and v = canst; hence
I
= canst.
•
We mention that if we use the polar fom1 z = r(cos 6 + i sin 6) and set
fez) = u(r, 6) + iv(r, 6), then the Cauchy-Riemann equations are (Prob. 11)
LIT
=
(7)
vT
=
r
v e,
(r> 0).
r
LI/I
Laplace's Equation. Harmonic Functions
The great importance of complex analysis in engineering mathematics results mainly from
the fact that both the real part and the imaginary part of an analytic function satisfy
Laplace's equation, the most important PDE of physics. which Occurs in gravitation,
electrostatics, fluid flow, heat conduction, and so on (see Chaps. 12 and 18).
CHAP. 13
622
THEOREM 3
Complex Numbers and Functions
Laplace's Equation
If fez) = u(x, y) + iv(x, y) is lInalytic in
Laplace's equation
II
d0111l1in D. then both
II
and v
sati.~f\'
(8)
(V2 read "nabla squared") and
(9)
in D and h(lI'e continuous second partial derivatives in D.
PROOF
Differentiating
Ux
=
Vy
with respect to
x
and
=
uy
-vx
with respect to y, we have
(10)
Now the derivative of an analytic function is itself analytic. as we shall prove later (in
Sec. 14.4). This implies that u and v have continuous partial derivatives of all orders: in
particular, the mixed second derivatives are equal: vYT = v XY ' By adding (10) we thus
obtain (8). Similarly, (9) is obtained by differentiating Ux = Vy with respect to y and
lty = -v x with respect to x and subtracting, using uxy = uyx '
•
Solutions of Laplace's equation having conti1luous second-order partial derivatives
are called harmonic functions and their theory is calIed potential theory (see also
Sec. 12.10). Hence the real and imaginary parts of an analytic function are harmonic
functions.
If two harmonic functions u and v satisfy the Cauchy-Riemann equations in a domain
D, they are the real and imaginary parts of an analytic function f in D. Then v is said to
be a harmonic conjugate function of u in D. (Of course, this has absolutely nothing to
do with the use of "conjugate" for z.)
E X AMP L E 4
How to Find a Harmonic Conjugate Function by the Cauchy-Riemann Equations
Verify that
v of 1/.
1/ =
x2
-
\,2 -
Y is harmonic in the whole complex plane and find a harmonic conjugate function
Solution. ,21/ = 0 by direct calculation. Now lIx = 2x and
Cauchy-Riemann equations a conjugate v of 1/ must satisfy
vx
2x,
Vy = lIx =
lIy
~ -1/ y ~
= - 2.1' -
I. Hence because of the
2,·
_ + 1.
Integrating the first equation with respect to )' and differentiating the result with respect to .t. we obtain
v =
2.\)'
+ h(x).
dh
2y + dx .
Vx =
A comparison with the second equation shows that dh/dr: = 1. This gives hex) = x + c. Hence v = 2.\)' + X + c
(c any real constant) is the most general hannonic conjugate of the given II. The conesponding analytic function is
I(::.)
= II +
iv ~ x
2
-
)'2 -
)'
+
;(2.\)'
+
X
+
c)
= ~2 + ;: +
;e.
•
SEC 13.5
Exponential Function
623
Example 4 illustrates that a conjugate of a given harmonic function is uniquelv determilled
up to an arbitrary real additive constant.
The Cauchy-Riemann equations are the most important equations in this chapter. Their
relation to Laplace's equation opens wide ranges of engineering and physical applications,
as we shall show in Chap. 18 .
. ........
~
CAUCHY-RIEMANN EQUATIONS
22.
U
=
e 3:];
co~
ay
23. u = sin x cosh cy
Are the following functions analytic? [Use (1) or (7).]
1. f(:;.)
=
2
2. f(::.) = 1m
:;.4
3.
e x(cos y
5.
e-X(cos
+ i sin y)
y - i sin y)
7. f(z) = Re z + 1m z
9. f(:;.) = i/::. 8
(:;.2)
4. f(:;.) = I/O -
6. fez) = Arg
10. f(:;.) =
7TZ.
Izl
8. f(z.) = In
::.2
:;.4)
+
+ i Arg z
I/:;.2
11. (Cauchy-Riemann equations in polar form) Derive
(7) from (1).
112-21/
HARMONIC FUNCTIONS
f(:;.) = u (x, y)
+
iv(x, y).
13. v = xy
- yl(x 2
14. v
16. v = In Izl
18. Lt = I/(x 2 +
20.
Lt
=
+
y2)
)'2)
15. u = In Izl
17. II = x 3 - 3xy2
19. U = (x 2 _ )'2)2
21.
cos x cosh y
122-241
l/
= e- x
sin 2)'
Determine a, b, C such that the given functions
are harmonic and find a harmonic conjugate.
13.5
26. TEAM PROJECT. Conditions for fez) = COllst. Let
f(:;.) be analytic. Prove that each of the following
conditions is sutIicient for f(:;.) = COllst.
(a) Re fez) = comt
(b) [m f(:;.) =
(c)
Are the following functions harmonic? If your answer is
yes, find a corresponding analytic function
12. u = x)'
25. (Harmonic conjugate) Show that if II is harmonic and
v is a harmonic conjugate of II, then II is a harmonic
conjugate of -v.
f' (z)
=
COIUT
0
(d) If(z)1 = COllst (see Example 3)
27. (Two further formulas for the derivative). Formulas
(4). (5), and (J I) (below) are needed from time to time.
Derive
(II)
J'(;:;) =
Ux -
illy,
f' (z)
= Vy
+
iv x '
28. CAS PROJECT. Equipotential Lines. Write a
program for graphing equipotential lines II = comt of
a harmonic function II and of its conjugate v on the
same axes. Apply the program to (a) II = x 2 - )'2,
U = 2xy, (b) u = x 3 - 3xy2, U = 3x 2y _ y3,
(c) U = eX cos )', v = eX sin y.
Exponential Function
In the remaining sections of this chapter we discuss the basic elementary complex
functions, the exponential function, trigonometric functions. logarithm, and so on. They
will be counterparts to the familiar functions of calculus, to which they reduce when
z = x is real. They are indispensable throughout applications, and some of them have
interesting properties not shared by their real counterparts.
We begin with one of the most important analytic functions, the complex exponential
function
also written
exp Z.
The definition of e Z in terms of the real functions eX, cos y, and sin y is
(1)
624
CHAP. 13
Complex Numbers and Functions
This definition is motivated by the fact the eZ extends the real exponential function eX of
calculus in a natural fashion. Namely;
(A) eZ = eX for real z = x because cos Y = 1 and sin y = 0 when y =
o.
(B) eZ is analytic for all z. (Proved in Example 2 of Sec. 13.4.)
(e) The derivative of eZ is eZ • that is.
(2)
This follows from (4) in Sec. 13.4.
REMARK. This defInition provides for a relatively simple discussion. We could defme eZ by
the familiar series I + x + x2/2! + x 3 /3! + ... with x replaced by Z, but we would then have
to discuss complex series at this very early stage. (We will show the connection in Sec. 15.4.)
Further Properties. A function I(::) that is analytic for all :: is called an entire function.
Thus, eZ is entire. Just as in calculus the fUllctional relation
(3)
holds for any 21
=
+
Xl
iYl
and
Z2
=
X2
+ iYz. Indeed, by
(1),
Since e e = e + for these real functions, by an application of the addition fonnulas
for the cosine and sine functions (similar to that in Sec. 13.2) we see that
X1
X2
X1
X2
as asserted. An interesting special case of (3) is
Zl
=
X, Z2
= iy; then
(4)
Furthennore, for
Z
= iy we have from
(5)
e
iy
=
(1) the so-called Euler formula
cosy
+ i siny.
Hence the polar form of a complex number, ;::
(6)
From (5) we obtain
(7)
as well as the important formulas (verify!)
(8)
e 7Ti
= -1,
=
r(cos
e + i sin 0). may now be written
SEC. 13.5
Exponential Function
625
Another consequence of (5) is
leiYI =
(9)
leos y
+ i sin yl =
Vcos
2
y
+
sin2 y
=
1.
That is, for pure imaginary exponents the exponential function has absolute value I, a
result you should remember. From (9) and (1),
(10)
argeZ = y ± 2nn (n = 0, 1,2," .),
Hence
since !ezi = eX shows that (1) is actually ~ in polar form.
From lezi = eX *- 0 in (0) we see that
(11)
for all z.
So here we have an entire function that never vanishes, in contrast to (nonconstant)
polynomials, which are also entire (Example 5 in Sec. 13.3) but always have a zero, as
is proved in algebra.
Periodicity of e Z with period 27Ti,
(12)
for all
z
is a basic property that follows from (1) and the periodicity of cos y and sin y. Hence all
the values that w = e Z can assume are already assumed in the horizontal strip of width
27T
(13)
(Fig. 333).
-n<Y~7T
This infinite strip is called a fundamental region of eZ •
E X AMP L E 1
Function Values. Solution of Equations.
Computation of values from (I) provides no problem. For instance. verify that
4
e1. -
O.6i
=
e1.
4
tcos 0.6 - i sin 0.6)
=
4.055(0.8253 - 0.5646i) = 3.347 - 2.289;
Arg e1.4 - 0 .6i = -0.6.
To illustrate (3), take the product of
e2 + i = e2 (cos 1
+i
sin I)
and
y
x
-Tr:
Fig. 333. Fundamental region of the
exponential function e in the z-plane
Z
CHAP. 13
626
Complex Numbers and Functions
To solve the equation eZ = 3
solutions. Now, since eX = 5,
eX cosy
= 3.
+
4i, note first that lezi = eX = 5.
eX sin v = 4.
cosy = 0.6.
X
= In 5
=
1.609 is the real part of all
siny = 0.8.
y = 0.927.
Am. :: = 1.609 + 0.927i::': 211'11'; (n = O. 1.2, ... ). The~e are infinitely many solutions (due to the periodicity
of eZ ). They lie on the vertical line x = 1.609 at a distance 27i" from their neighbors.
•
To summarize: many properties of eZ = exp z parallel those of eX; an exception is the
periodicity of £f with 2ni, which suggested the concept of a fundamental region. Keep in
mind that ~ is an entire function. (Do you still remember what that means?)
1. Using the Cauchy-Riemann equations, show that e Z is
entire.
12-81
Values of eZ • Compute eZ in the form u + iv and
lezl, where ~ equals:
3. I + 2i
2. 3 + 71'i
4. Vz - !71'i
5. 771'il2
7. 0.8 - 5i
6. (l + i)71'
8. 971'i/2
19-12 1
Real and Imaginary Parts. Find Re and 1m of:
9. e- 2z
11. e
113-171
13.
15.
10. e
z3
z2
Vi
Polar Form. Write in polar form:
14. 1 +
V;
17. -9
13.6
16. 3
+ 4i
118-21\
Equations. Find all solutions and graph some of
them in the complex plane.
18. e 3 • = 4
19. e Z = -2
Z
21. e Z = 4 - 3i
20. e = 0
22. TEAM PROJECT. Further Properties of the
Exponential Function. (a) Analyticity. Show that c
is entire. What about el/z? e Z? eX(cos ky + i sin ky)'?
(Use the Cauchy-Riemann equations.)
(b) Special values. Find all ;;: such that (i) e Z is real.
(ii) le-zi < 1, (iii) e Z = 'if.
(c) Harmonic function. Sho~- that
u = e XY cos (x 2 /2 - )'2/2) is harmonic and find a
conjugate.
(d) Uniqueness. [t is interesting that f(z) = e Z is
uniquely determined by the two properties
f(x + iO) = eXand!'(;;:) = f(z).wherefisassumed
to be entire. Prove this using the Cauchy-Riemann
equations.
Trigonometric and Hyperbolic Functions
Just as we extended the real eX to the complex eZ in Sec. 13.5. we now want to extend
the familiar real trigonometric functions to complex trigonometric flillctiollS. We can do
this by the use of the Euler formulas (Sec. 13.5)
eix = cos x + i sin x,
e- ix = cosx - i sinx.
By addition and subtraction we obtain for the real cosine and sine
This suggest,> the following definitions for complex values
z = x + iy:
SEC 13.6
627
Trigonometric and Hyperbolic Functions
(1)
slnz
=
It is quite remarkable that here in complex. functions come together that are unrelated in
real. This is not an isolated incident but is typical of the general situation and shows the
advantage of working in complex.
Furthermore, as in calculus we define
(2)
tanz =
sin z
cos
cot;::
z
z
sin z
cos
=
and
(3)
sec
z=
I
cos
csc
z
z=
sin z
Since eZ is entire, cos z and sin z are entire functions. tan z and sec z are not entire; they
are analytic except at the points where cos;:. is zero; and cot z and csc z are analytic except
where sin z is zero. Formulas for the derivatives follow readily from (~)' = eZ and (1)-(3);
dS in calculus,
(4)
(cos ;:.)'
(sin z)'
-sin?.
(tan z)' = sec 2
= cos z.
z,
etc. Equation (I) also shows that Euler's formula is valid ill complex:
eiz = cos;:. + i sin z
(5)
for all
z.
The real and imaginary parts of cos z and sin z are needed in computing values, and
they also help in displaying properties of our functions. We illustrate this with a typical
example.
E X AMP L E 1
Real and Imaginary Parts. Absolute Value. Periodicity
Show that
(a)
cos ~ = cos x cosh Y - i sin x sinh y
(b)
sin z. = sin x cosh y
(6)
+ i cos x sinh y
and
Icos :12 = cos2 x
(a)
+ sinh2 y
(7)
Ib)
and give some applications of these
Solution.
forrnula~.
From (1).
cos z =
~(ei(x+iYJ
+
= ~e -Y(cos x
= ~(eY
+
+ i sin x) + ~eY(cos X
e- Y)
This yields (6a) since. as is known fonn calculus,
(8)
e -i(x+iYJ)
cos x
-
~i(eY -
e- Y)
-
i sin xl
sinx.
628
CHAP. 13
Complex Numbers and Functions
(6b) is obtained similarly. From (6a) and cosh2 y = I
Icos zl2 ~ (cos2 x) (I
+ sinh2 y
we obtain
+ sinh2 y) + sin2 x
sinh2 y.
Since sin2 x + cos2 x = I, this gives (7a). and (7bl is obtained similarly.
For instance, cos (2 + 3i) = cos 2 cosh 3 - i sin 2 sinh 3 = -4.190 - 9.109i.
From (6) we see that cos z and sin z are periodic with period 2n, just as in real. Periodicity of tan;: and
cot z with period 7r now follows.
Formula (7) points to an essential difference between the real and the complex cosine and sine; whereas
Icos xl ~ I and Isin xl ~ I, the complex co~ine and sine functions are 110 10llger boullded but approach infinity
in absolute value as y --'> x, since then sinh y ~ 00 in (7).
•
E X AMP L E 2
Solutions of Equations. Zeros of cos z and sin z
Solve la) cos z = 5 (wluch has no real solution!), (b) cos z = 0, (e) sin z = o.
(a) e 2iz - 10iz + I = 0 from (1) by multiplication by e iz . This is a quadratic equation in e iz,
with solutions (rounded off to 3 decimals)
Solution.
i z = e -y+ix = 5
:'::
V25=""l ~
9.899
and
0.1O\.
Thus e- Y = 9.899 or 0.\01, e ix = I, Y = :'::2.292, x = 2wTi". AilS. Z ~ ±21l7r ± 2.292i (11 = 0, 1,2, .. ').
Can you obtain this from (6a)?
(b) cos x = 0, sinh y = 0 by (7a), y = O. Ans. z = ::':~(2n + 1)7r (11 = 0, 1,2, .. ').
(C) sin x = 0, sinh y = 0 by (7b). Ans. z = :'::1l7r (11 = 0, I, 2, .. '). Hence the only zeros of cos z and
sin;: are those of the real cosine and sine functions.
•
General formulas for the real trigonometric functions continue to hold for complex
values. This follows immediately from the definitions. We mention in particular the
addition rules
(9)
cos
(Zl
±
Z2) =
sin
(Zl
±
Z2)
cos Zl cos
Z2 =+=
sin Zl sin
Z2
= sin Zl cos Z2 ± sin Z2 cos Z]
and the fOilliula
cos 2 Z
(10)
+
sin2 Z
= 1.
Some further useful formulas are included in the problem set.
Hyperbolic Functions
The complex hyperbolic cosine and sine are defined by the formulas
(11)
This is suggested by the familiar definitions for a real variable [see (8)]. These functions
are entire, with derivatives
(12)
(cosh z)'
= sinh z,
(sinh z)'
= cosh z,
as in calculus. The other hyperbolic functions are defined by
SEC. 13.6
Trigonometric and Hyperbolic Functions
tanh z =
629
sinh.:
cosh z
coth z =
cosh z
sinh .:
(13)
sech z =
1
cosh z
,
csch z =
sinh z
If in (11), we replace
Complex Trigonometric alld Hyperbolic FUllctions Are Related.
z by iz and then use (1), we obtain
(14)
cosh iz
=
cos z;,
sinh iz
=
i sin z.
Similarly, if in (1) we replace z by i;:. and then use (II), we obtain conversely
(15)
cos iz = cosh z,
sin iz
=
i sinh z.
Here we have another case of unrelated real functions that have related complex analogs.
pointing again to the advantage of working in complex in order to get both a more unified
formalism and a deeper understanding of special functions. This is one of the main reasons
for the importance of complex analysis to the engineer and physicist.
1. Prove that cos z, sin z, cosh z, sinh Z are entire
functions.
2. Verify by differentiation that Re cos z and 1m sin z are
harmonic.
13-61 FORMULAS FOR HYPERBOLIC FUNCTIONS
Show that
14. sinh (4 - 3i)
15. cosh (4 - 67Ti)
16. (Real and imaginary parts) Show that
Re tan z
1m tan
3.
cosh z
sinh z
4. cosh (ZI
sinh (ZI
=
=
+ i sinh x sin y
sinh x cos y + i cosh x sin y.
cosh x cos Y
+ Z2) = cosh ZI cosh Z2 + sinh ZI sinh Z2
+ Z2) = sinh Zl cosh Z2 + cosh ZI sinh Z2'
5. cosh2 Z - sinh2 z.
6. cosh2 Z + sinh2 Z
=
=
z=
sinhy coshy
cos X + sinh y .
--=--'---'-=-2
2
117-211 Equations. Find all solutions of the following
equations.
17. cosh z = 0
18. sin z = 100
19. cos Z = 2i
20. cosh z = - 1
21. sinh z = 0
22. Find all z for which (a) cos z, (b) sin z has real values.
1
cosh 2z
17-151 Function Values. Compute (in the form u
7. cos(l + i)
8. sin(1 + i)
9. sin 5i, cos 5i
10. cos 37Ti
11. cosh (-2 + 3i), cos (-3 - 2i)
12. - i sinh (- 7T + 2i), sin (2 + 7Ti)
13. cosh (2n + 1)7Tl, n = 1,2, ...
sin x cos x
= --=-------,=-cos 2 X + sinh2 y ,
+ iv)
123-25] Equations and Inequalities. Using the
definitions, prove:
23. cos z is even. cos (-z) = cos z, and sin z is odd,
sin (-z) = -sin z.
24. Isinh yl ~ lcos zl ~ cosh y, Isinh yl ~ Isin zl ~ cosh y.
Conclude that the complex cosine and sine are not
bounded in the whole complex plane.
25. sin ZI cos Z2
=
Hsin (ZI
+
Z2)
+ sin (Zl
-
Z2)]
630
13.7
CHAP. 13
Complex Numbers and Functions
Logarithm. General Power
We finally introduce the complex logarithm, which is more complicated than the real
logarithm (which it includes as a special case) and historically puzzled mathematicians
for some time (so if you first get puzzled-which need not happen!-be patient and work
through this section with extra care).
The natural logarithm of z = x + iy is denoted by In z (sometimes also by log z) and
is defined as the inverse of the exponential function; that is, W = In z is defined for
z =1= 0 by the relation
(Note that z = 0 is impossible, since
w = u + iv and:: = reifl , this becomes
e
W
0 for all w; see Sec. 13.5.) [f we set
=1=
Now from Sec. 13.5 we know that eu + iv has the absolute value eU and the argument v.
These must be equal to the absolute value and argument on the right:
v
= 8.
eU = r gives u = In r, where In r is the familiar real natural logarithm of the positive
number r = Izi. Hence w = u + iv = In z is given by
(1)
In:: = In,.
+ i8
(r
=
Izl >
8 = arg z).
0,
Now comes an important point (without analog in real calculus). Since the argument of
z is determined only up to integer mUltiples of 271", the complex 1latural logarithm In z
(z
0) is i1lfi1litely many-valued.
The value of In:: conesponding to the principal value Arg z (see Sec. 13.2) is denoted
by Ln :: (Ln with capital L) and is called the principal value of In::. Thus
*
(2)
Ln z
= In Izl + i Arg z
(z =1= 0).
The uniqueness of Arg z for given z (=1= 0) implies that Ln z is single-valued, that is, a
function in the usual sense. Since the other values of arg :: differ by integer multiples of
271", the other values of In:: are given by
(3)
In z
= Ln z
::'::: 2n71"i
(n
=
1. 2.... ).
They all have the same real part, and their imaginary paJ1s differ by integer multiples of 271".
If:: is positive real, then Arg z = 0, and Ln z becomes identical with the real natural
logarithm known from calculus. If z is negative real (so that the natural logarithm of
calculus is not defined!), then Arg z = 71" and
Ln z = In
Izi + 71"i
(z negative real).
SEC 13.7
631
Logarithm. General Power
From (l) and e1n r = r for positive real r we obtain
(4a)
as expected, but since arg (e Z )
=
y ± 2nn is multi valued. so is
In (c) =
(4b)
E X AMP L E 1
Z
± 21lni,
n
= 0,
I,···.
Natural Logarithm. Principal Value
In 1 = 0, ±2wi, ±4wi, ...
Ln 1= 0
In 4 = 1.386 294 ± 211wi
Ln 4 = 1.386294
In (-1) = ± m, ±3w;. ±Swi, ..
Ln (-I) = wi
In (-4) = 1.386294 ± (211 + I)wi
Ln (-4) = 1.386294
In i = wil2. - 3 w/2. S wi12 . .•.
+ wi
Ln i = wi/2
In 4; = 1.386294 + wi/2 ± 21lwi
Ln 4; = 1.386 294 + wil2
In (-4i) = 1.386294 - wi/2 ± 21lwi
Ln (-4i) = 1.386 294 - wi/2
In (3 - 4i) = In S + i arg (3 - 4i)
=
Ln (3 - 4i) = 1.609438 - 0.927 29S;
(Fig. 334)
1.609438 - 0.927 29Si ± 21171'i
v
-0.9 + 6n
•
1
•
,
1
-0.9 + 4n
1
1
-0.9 + 2n
+
o I---....I....----il- - ' - -0.9
+2
u
1
-0.9 - 2n
Fig. 334.
•
1
Some values of In (3 - 4;) in Example 1
The familiar relations for the natural logarithm continue to hold for complex values,
that is.
(5)
(a)
In (~1::2) = In Zl + In ':2,
but these relations are to be understood in the sense that each value of one side is also
contained among the values of the other side: see the next example.
E X AMP L E 2
Illustration of the Functional Relation (5) in Complex
Let
;:1=;:2=e"1Ti=-1.
If we take the principal values
Ln;::1 = LnZ2 = wi.
then (Sa) holds provided we write In (:1:::2) = In I = 2wi; however. it is not true for the principal value,
•
Ln (ZIZ2) = Ln 1 = O.
632
CHAP. 13
THE 0 REM 1
Complex Numbers and Functions
Analyticity of the Logarithm
For el'ery n = 0, ::t::: 1, ::t:::2, ... formula (3) defines a function, which is analytic,
except at 0 and 011 the Ilegarire real axis, alld has the derivative
(6)
PROOF
(ln~)
,
I
(z not 0 or negative real).
=-
z.
We show that the Cauchy-Riemann equations are satisfied. From (I )-(3) we have
In z
= In r +
ice + c)
.!.2
In (x 2 + v
'
2
=
)
+ i(arctan I. + c)
x
where the constant c is a multiple of 27r. By differentiation,
ux
x
=
2
X
+ y 2 = Vy =
1
+
(ylx)
2
x
Hence the Cauchy-Riemann equations hold. [Confirm this by using these equations in
polar form. which we did not use since we proved them only in the problems (to
Sec. 13.4).j Formula (4) in Sec. 13.4 now gives (6).
(-
;~)
x - iy
=
x2
+ y2
z
•
Each of the infinitely many functions in (3) is called a branch of the logarithm. The
negative real axis is known as a branch cut and is usually graphed as shown in Fig. 335.
The branch for 11 = 0 is called the principal branch of In z.
Fig. 335.
Branch cut for In z
General Powers
General powers of a complex number z
(7)
Since In z is infinitely many-valued,
value
is called the principal value of zC.
= x + iy are defined by the formula
(c complex, z
ZC
oF 0).
will, in general, be multi valued. The particular
SEC. 13.7
633
Logarithm. General Power
If c = n = 1. 2, ... , then zn is single-valued and identical with the usual nth power
of z. If c = -1, -2, ... , the situation is similar.
If c = I/n, where n = 2, 3, ... , then
(z oF 0),
the exponent is determined up to multiples of 27Ti/n and we obtain the n distinct values
of the nth root, in agreement with the result in Sec. 13.2. [f c = p/q, the quotient of two
positive integers, the situation is similar, and ZC has only finitely many distinct values.
However, if c is real irrational or genuinely complex, then ZC is infinitely many-valued.
E X AMP L E 3
General Power
ji
=i
In i
= exp (i In i) = exp [i (
-i
j
± 2n17i) ] = e -(",/2)+2n".
All these values are real, and the principal value (n = 0) is e -",/2.
Similarly, by direct calculation and multiplying out in the exponent,
(I
+ i)2-i
=
exp [(2 - i) In (I
= 2e",/4"'2n"'[sin
+
i)] = exp [(2 - i) {In V2 + !17i ± 21l17il]
(~In 2)
•
+ i cos (~In 2)].
It is a convention that for real positive z = x the expression ZC mean:" e In x where In x
is the elementary real natural logarithm (that is, the principal value Ln z (z = x > 0) in
the sense of our definition). Also, if z = e, the base of the natural logarithm, ZC = eC is
conventionally regarded as the unique value obtained from (1) in Sec. l3.5.
From (7) we see that for any complex number a,
C
(8)
aZ = e" In a.
We have now introduced the complex functions needed in practical work. some of them
(e", cos z, sin z. cosh z, sinh z) entire (Sec. 13.5), some of them (tan z, cot z, tanh z. coth z)
analytic except at certain points, and one of them (In z) splitting up into infinitely many
functions, each analytic except at 0 and on the negative real axis.
For the inverse trigonometric and hyperbolic functions see the problem set.
~-il
Principal Value Ln z. Find Ln z when z equals:
1. - to
2. 2 + 2;
3. 2 - 2i
4. -5 ~ O.li
5. -3 - 4i
6. -100
7. 0.6 + O.Si
B. -ei
9. 1 - i
110-161 All Values of In z. Find all values and graph
some of them in the complex plane.
10. In 1
n. In (-I)
13. In (-6)
12. In e
14. In (4
+ 3i)
15. In (-e-')
16. In (e 3i )
17. Show that the set of values of In (i2) differs from the
set of values of 2 In i.
11B-21/
lB. In
Equations. Solve for z:
z = (2 -
!i)7T
20.lnz=e-7Ti
19. In z
=
0.3 + 0.7;
21. In z
=
2
+ ~7Ti
CHAP. 13
634
\22-2S \
22.
24.
26.
2S.
General Powers. Showing the details of your
work, find the principal value of:
i2i, (2i)i
(l -
Complex Numbers and Functions
i)l+i
(-I )1-2i
23. 4 3 + i
25. (l +
27. ;112
Yz2 -
(a) arccos.;: = -i In (;: +
+ ~)
(b) arcsin z: = -i In (i.;:
(c) arccosh z = In (z:
il-'
(d) arcsinh;:: = In (z
1)
+ -w-=-I)
+ W+l)
(3 - 4i)1I3
i
i
+
Z
(e) arctan;:: = - In - 2
i - ::
29. How can you find the answer to Prob. 24 from the
answer to Prob. 25?
30. TEAM PROJECT. Inverse Trigonometric and
Hyperbolic Functions. By definition. the inverse sine
w = arcsin z is the relation such that sin w = z. The
inverse cosine w = arccos:: is the relation such that
cos W = ::. The in,-erse tangent, inverse cotangent,
innrse hyperbolic sine, etc .. are defined and denoted
in a similar fashion. (Note that all these relations are
mllitivailled.) Using sin w = (e i '" - e- iW )/(2i) and
similar representations of cos w, etc .. show that
<n
I
arctanh;:: =
2"
I
+z
z
In 1 -
(g) Show that w = arcsin:: is infinitely many-valued.
and if WI is one of these values, the others are of the
form WI ~ 11l7T and 7T - WI ~ 21l7T, 11 = 0, I, ....
(The principal mlue of w = u + iv = arcsin z is
defined to be the value for which -7T!2 ;;; U ;;; 7T!2
if v ~ 0 and - 7T!2 < 1I < 7T!2 if v < 0.)
-C-l-FA PT-"E~33_ R E-Vl..E-w=:QlJ EST ION SAN D PRO B L EMS
1. Add. subtract. multiply. and divide 26 3 + 4i as well as their complex conjugates.
7i and
2. Write the two given numbers in Prob. I in polar form.
Find the principal value of their arguments.
3. What is the triangle inequality? Its geometric meaning?
Its significance?
4. If you know the values of {,fl, how do you get from
them the values of ~ for any;:?
5. State the definition of the derivative from memory. It
looks similar to that in calculus. But what is the big
difference?
6. What is an analytic function? How would you test for
anal yticity?
7. Can a function be differentiable at a pomt without being
analytic there? If yes, give an example.
[16-2D
Complex Numbers. Find, in the fonn x
showing the details:
16. (1 + ;)12
17. (- 2 + 6;)2
IS. 1/(3 - 7i)
20.
\/-5 -
12i
19. (l - ;)/(1
+
+ iy.
i)2
21. (43 - 19i)/(8 + ;)
122-261 Polar Form. Represent in polar form. with the
principal argument:
23. -6 + 6i
22. 1 - 3i
25. -12i
24. YW/(4 + 2i)
26. 2
+
2;
\27-30\
Roots. Find and graph all values of
27. V&
29.~
2S. V'256
30. VC"32-:------::-24-i
[31-~
9. State the definitions of eZ , cos z. sin ;::. cosh z. sinh;:: and
the relations between these functions. Do these relations
have analogs in real?
Analytic Functions. Find f(.::) = u(x.y) + ;v(x.y)
with 1I or v as given. Check for analyticity.
31. 11 = x/(x 2 + y2)
32. v = e- 3x sin 3y
2
33. u = x - 2xy - y2 34. 1I = cos 1x cosh 2y
y2
35. v = e X2 sin 2xy
10. What properties of C are similar to those of eX ? Which
one is different?
@6-391 Harmonic Functions. Are the following
functiuns hannonic? If so, find a hannonic conjugate.
II. What is the fundamental region of eZ ? Its significance?
36. x 2 y2
3S. e- x / 2 cos!y
S. Are
1::1, .:. Re;::, 1m:: analytic? Give reason.
12. What is an entire function? Give examples.
13. Why is In z much more complicated than In x? Explain
from memory.
14. What is the principal value of In z?
15. How is the general power:;c defined? Give examples.
37. xy
39. x 2
+
y2
\40-451 Special Function Values. Find the values of
40. sin (3 + 47Ti)
41. sinh 47Ti
42. cos (57T + 2;)
43. Ln CO.8 + 0.6i)
44. tan (I + i)
45. cosh (I + 7Ti)
Summary of Chapter 13
635
Complex Numbers and Functions
For arithmetic operations with complex numbers
(l)
=
z
Izl
+
x
iy
=
re iIJ
=
r(cos e
+ i sin 8).
e = arctan (y/x). and for their representation in the complex
plane, see Secs. 13.1 and 13.2.
A complex function f(:;) = u(x, y) + iv(x. y) is analytic in a domain D if it has
a derivative (Sec. 13.3)
r=
= \lx2 + )'2,
t' (z) = lim
(2)
fez
+ Llz)
- fez)
Llz
.lz~o
everywhere in D. Also, fez) is analytic at a point z = '<:0 if it has a derivative in a
neighborhood of Zo (not merely at Zo itself).
If fez) is analytic in D. then u(x. y) and v (x. y) satisfy the (very important!)
Cauchy-Riemann equations (Sec. 13.4)
(3)
everywhere in D. Then
(4)
au
au
au
au
ax
dy ,
ay
ax
II
and v also satisfy Laplace's equation
+ Uyy =
U XX
0,
everywhere in D. If 1I(x, y) and u(x. y) are continuous and have continuous partial
derivatives in D that satisfy (3) in D. then fez) = u(x, y) + iu(x, y) is analytic in
D. See Sec. 13.4. (More on Laplace's equation and complex analysis follows in
Chap. 18.)
The complex exponential function (Sec. 13.5)
(5)
e
Z
= exp z
=
eX
(cos y + i sin y)
reduces to e T if z = x (y = 0). It is periodic with 27Ti and has the derivative eZ •
The trigonometric functions are (Sec. 13.6)
I.
cos z = - (en
2
+ e-
.
ZZ
)
= cos x cosh V - i sin x sinh \'
-
(6)
1.
.
sin z = - (eOZ - e- OZ ) = sin x cosh \' + i cos x sinh \'
2i
- -
and, furthermore,
tan z
= (sin z)Jcos z,
cot Z
= lItan:.
etc.
636
CHAP. 13
Complex Numbers and Functions
The hyperbolic functions are (Sec. 13.6)
(7)
cosh z =
!(eZ + e-Z) = cos iz,
etc. The functions (5)-(7) are entire, that is, analytic everywhere in the complex
plane.
The natural logarithm is (Sec. 13.7)
(8)
In z
= In Izl + i
arg z
=
In Izl
+i
Arg z ::!: 2117Ti
where z =F 0 and II = 0, 1, . . . . Arg z is the principal value of arg z, that is.
-7T < Arg z ~ 7T. We see that In z is infinitely many-valued. Taking n = 0 gives
the principal value Ln z of In z; thus Ln z = In Izl + i Arg z.
General powers are defined by (Sec. 13.7)
(9)
(c complex. z =F 0).
~CHAPTER
~7
/
14
Complex Integration
Two main reasons account for the importance of integration in the complex plane. The
practical reason is that complex integration can evaluate certain real integrals appearing
in applications that are not accessible by real integral calculus. The theoretical reason is
that some basic properties of analytic functions are difficult to prove by other methods.
A striking property of this type is the existence of higher derivatives of an analytic function.
Complex integration also plays a role in connection with special functions. such as the
gamma function (see [GRll. p. 255), the error function. various polynomials (see [GRIOD
and others. and the application of these functions in physics.
In this chapter we define and explain complex integrals. The most important result in
the chapter is Cauchy's integral theorem or the Callchy-Goursat theorem, as it is also
called (Sec. 14.2). It implies Cauchy's integral formula (Sec. 14.3), which in tum implies
the existence of all higher derivatives of an analytic function. Hence in this respeCl,
complex analytic functions behave much more simply than real-valued functions of real
variables, which may have derivatives only up to a certain order.
A further method of complex integration, known as integration by residues, and its
application to real integrals will need complex series and follows in Chap. 16.
Prerequisite: Chap. 13
References alld Answers to Prohlems: App. I Part D, App. 2.
14.1
Line Integral in the Complex Plane
As in calculus we distinguish between definite integrals and indefinite integrals or
antiderivatives. An indefinite integral is a function whose derivative equals a given
analytic function in a region. By inverting known differentiation formulas we may find
many types of indefinite integrals.
Complex definite integrals are called (complex) line integrals. They are wlitten
fc
fez) dz.
Here the integrand fez) is integrated over a given curve C or a portion of it (an arc. but
we shall say "curve" in either case, for simplicity). This curve C in the complex plane is
called the path of integration. We may represent C by a parametric representation
(1)
z(t)
= x(t) + iy(t)
(a ~ t ~ b).
637
638
CHAP. 14
Complex Integration
The sense of increasing t is called the positive sense on C, and we say that C is oriented
by (1).
For instance, 2(t) = t + 3it (0 ~ t ~ 2) gives a portion (a segment) of the line)' = 3x.
The function .:(t) = 4 cos t + 4i sin t (- 7T ~ t ~ 7T) represents the circle Izl = 4. and so
on. More examples follow below.
We assume C to be a smooth curve, that is, C has a continuous and nonzero derivative
.
d:
.
dt = x(t)
z(t) = -
+
.
..
i,,(t)
at each point. Geometrically this means that C has everywhere a continuously turning
tangent, as follows directly from the definition
•
z{t)
.
= hm
z(t
+
.It) - :(1)
(Fig. 336).
!1t
~t~O
Here we use a dot since a prime' denotes the derivative with respect to z.
Definition of the Complex Line Integral
This is similar to the method in calculus. Let C be a smooth curve in the complex plane
given by (1), and let fez) be a continuous function given (at least) at each point of C. We
now subdivide (we "partition") the interval a ~ t ~ b in (1) by points
where to <
points
tl
< ... <
tn'
To this subdivision there corresponds a subdivision of C by
Zn-l'
Zn (= Z)
(Fig. 337).
Z
/
m
z
z(t)
o
Fig. 336. Tangent vector i(t) of a curve C in the
complex plane given by z(t). The arrowhead on the
curve indicates the positive sense (sense of increasing t).
Fig. 337.
Complex line integral
where Zj = :(tj)' On each portion of subdivision of C we choose an arbitrary point, say,
a point (1 between Zo and 21 (that is, (1 = z(t) where t satisfies to ~ t ~ t 1), a point (2
between '::1 and 22' etc. Then we form the sum
n
(2)
where
m=1
We do this for each 11 = 2, 3, ... in a completely independent manner. but so that the
greatest 1!11ml = It.m- tm - 1 1 approaches zero as n ~ 00. This implies that the greatest
SEC. 14.1
Line Integral in the Complex plane
639
ILlZmI also approaches zero. Indeed, it cannot exceed the length of the arc of C from Zm-I
to Zm and the latter goes to zero since the arc length of the smooth curve C is a continuous
function of t. The limit of the sequence of complex numbers S2, S3 • ... thus obtained is
called the line integral (or simply the imegral) of f(z) over the path of integration C with
the oriemarion given by (l). This line integral is denoted by
I
(3)
c
f
or by
f(::) d::.,
c
f(z) d:
if C is a closed path (one whose terminal point Z coincides with its initial point
a circle or for a curve shaped like an 8).
::'0,
as for
General Assumption. All paths of integration for complex line integrals are assllmed to
be piecewise smooth, that is. they consist offinitely many smooth curves joined end to end.
Basic Properties Directly Implied by the Definition
1. Linearity. Integration is a linear operation, thar is, we can imegrare sums term by
term and can take out constant factors from under the imegral sign. This mean~ that
if the integrals of f 1 and .f2 over a path C exist, so does the integral of kIf 1 + k2f2
over the same path and
2. Sense reversal in integrating over the same path, from
Z.o (right), introduces a minus sign as shown,
I
(5)
z
Iz
::'0
to Z (left) and from Z to
Zo
f(:) dz
=
Zo
-
fez) dz.
3. Partitioning of path (see Fig. 338)
(6)
I
f(;:) d::
C
Fig. 338.
=
I
f(::.) d::.
C1
+
I
f(z) dz.
C2
Partitioning of path [formula (6)]
Existence of the Complex Line Integral
Our assumptions that f(:) is continuous and C is piecewise smooth imply the exi'itence
of the line integral (3). This can be seen as follows.
As in the preceding chapter let us write f(z) = H(X. y) -t iv(x, y). We also set
and
640
CHAP. 14
Complex Integration
Then (2) may be written
(7)
Sn
.L (u + iv)(·.h + i!l.Ym)
=
m
where u = u«(m, 7]",). v = V«(7m 7]",) and we sum over I7l from 1 to n. Performing the
multiplication. we may now split up Sn into four sums:
These sums are real. Since f is continuous, u and v are continuous. Hence, if we let n
approach infinity in the aforementioned way, then the greatest !l.xm and .lYm will approach
zero and each sum on the right becomes a real line integral:
lim SII =
(8)
n--+oc
f
f(z) dz
C
=
f
u dx -
C
f
v dy
C
+i
[f
C
u dy
+
f
v dX] .
C
This shows that under our assumptions on f and C the line integral (3) exists and its value
•
is independent of the choice of subdivisions and intermediate poinb (m'
First Evaluation Method:
Indefinite Integration and Substitution of Limits
This method is the analog of the evaluation of definite integrals in calculus by the
well-known formula
f
b
f(x) dx = F(b) - F(a)
[F' (x)
=
f(x)].
a
It is simpler than the next method. but it is suitable for analytic functions only. To formulate
it, we need the following concept of general interest.
A domain D is called simply connected if every simple closed curve (closed curve
without self-intersections) encloses only points of D.
For instance, a circular disk is simply connected, whereas an annulus (Sec. 13.3) is not
simply connected. (Explain!)
THEOREM 1
Indefinite Integration of Analytic Functions
Let f(;:.) be analytic in a simply connected domain D. Then there exists an
indefinite integral of f(::.) in the domain D, that is, an analytic function F(::.) such that
F' (::.) = f(::.) in D, and for all paths in D joining two poims 20 and ZI in D we have
f'f(z) dz
(9)
= F(ZI) -
F(zo)
[F' (z) =
f(z)].
20
(Note that we can write
those C from ::'0 to 21')
20
and ::'1 instead of C, since we get the same value for all
SEC. 14.1
Line Integral in the Complex plane
641
This theorem will be proved in the next section.
Simple connectedness is quite essential in Theorem I, as we shall ~ee in Example 5.
Since analytic functions are our main concern, and since differentiation formulas will often
help in finding F(z) for a given fez) = F' (z), the present method is of great practical interest.
If fez) is entire (Sec. 13.5), we can take for D the complex plane (which is certainly
simply connected).
EXAMPLE 1
EXAMPLE 2
f
I
f
1+i
_2
do
0-'3
Z3
.cos::: d::: = sin:::
S-3m
1
(1
3
= -
0
7Ti
-m
E X AMP L E 3
11 + i
1
= -
I'
I
To
•
+
2
•
•
2
= - - + - i
i)3
33
= 2 sin 7ri = 2i sinh r. = 23.097i
-m
e z/ 2
d:::
=
S- 3 m
2eZ / 2
8+m
=
e4 + mt2 )
2(e4-37Ti/2 -
=0
8+m
•
since eZ is periodic with period 2r.i.
E X AMP L E 4
I~i ~
= Ln i - Ln (-i) =
i ; - (-
i;) =
ir.. Here D is the cumplex plane without 0 and the negative
real axis (where Ln::: is not analytic). Obviously, D is a simply connected domain.
•
Second Evaluation Method:
Use of a Representation of a Path
This method is not restricted to analytic functions but applies to any continuous complex
function.
THEOREM 2
Integration by the Use of the Path
Let C be a piecewise smooth path, represented by Z
fez) be a continuous function on C. Then
f f(;:;)
(10)
z(t), where a 3 t 3 b. Let
f f[z(t)]Z(t) dt
b
d:
C
PROOF
=
=
a
The left side of (10) is given by (8) in tenns of real line integrals, and we show that the
right side of (10) also equals (8). We have;:; = x + iy, hence = + iY. We simply
wnte II for u[x(t), y(t)] and v for v[x(t). y(t)]. We also have d"t = dt and dy = Ydt.
Consequently, in (10)
z x
x
b
b
f f[z(t)]Z(t) dt = f (u +
= f [u dx -
iv)(.t
a
+ (),) dt
a
v dy
+
i(u dy
+ V dx)]
C
=
fc(u dx -
V
dy)
+
ifc (u dy +
V
dx).
•
642
CHAP. 14
Complex Integration
COMMENT. In (7) and (8) of the existence proof of the complex line integral we referred
to real line integrals. If one wants to avoid this, one can take (to) as a definition of the
complex line integraL
Steps in Applying Theorem 2
(A) Represent the path C in the form z(t) (a
~
t
~
b).
(B) Calculate the derivative z(t) = dz/dt.
(C) Substitute zlt) for every z in .«z) (hence x(t) for x and y(t) for y).
(D) Integrate .f[z(t)]z(t) over t from a to b.
E X AMP L E 5
A Basic Result: Integral of 1/z Around the Unit Circle
We show that by integrating
Sec. 13.3) we obtain
1I~
counterclockwise around the unit circle (the circle of radius I and center 0; see
J, dz = 21Ti
(11)
(C the unit circle,
counterclockwise).
Jc z
This is a vel)' importalll result that we shall need quite often.
Solutioll.
(A) We may represent the unit circle C in Fig. 327 of Sec. 13.3 by
z(t) = cos
T
+ i sin r =
eit
(0 ~
r
~
27T).
so that counterclockwise integration corresponds to an increase of t from 0 to 27T.
(B) Differentiation gives :(t) = ieit (chain rule!).
(C) By substitution. f(~(t) = 1I~(t) = e -it.
(D) From (10) we thus obtain the result
f
dz
- =
f 2.,,-..
f2"'-dt = 27Ti.
e -'tie't dt = i
c zoo
Check this result by using ~(tl = cos t + i sin t.
Simple cOllllectedlless is esselltial ill Tlzeorem 1. Equation (9) in Theorem I gives 0 for any closed path
because then ;:1 = ;:0, so that F(;:1) - F(;:o) = O. Now 1/;: is not analytiC at z = O. But any simply connected
domain containing the unit circle must contain ~ = 0, so that Theorem I does not apply-it is not enough that
liz is analytic in an annulus. say. ~ < Izl < i, because an annulus is not simply connected!
•
E X AMP L E 6
Integral of 1/z m with Integer Power m
Let fez) = (z - :Co)m where m is the integer and
of radius p with center at ~o (Fig. 339).
;:0
a constant. Integl'ate counterclockwise around the circle C
y
x
Fig. 339.
Path in Example 6
SEC. 14.1
643
Line Integral in the Complex Plane
Solutioll.
We may repre,ent C in the form
= :0
:e(t) = Zo + p(cos t + i sin t)
+ peit
(0 "'" t "'" 2'IT).
Then we have
t
dz = ipi dt
and obtain
By the Euler formula (5) in Sec. 13.6 the right side equals
fo
2w
ip>n+l
[
2w
]
COS(n1+l)tdt+if sin(m+l)rdt
o
.
'*
If 111 = - I. we have pm+l = I, cos 0 = 1, sin 0 = O. We thus obtain 2'ITi. For integer nI
1 each of the two
integrals is zero because we integrate over an interval of length 2'IT, equal to a period of sine and cosine. Hence
the result is
(m
=
(111
*" -) and integer).
(12)
-1),
•
Dependence on path. Now comes a very important fact. If we integrate a given function
J(z) from a point Zo to a point Zl along different paths, the integrals will in general have
different values. In other words. a complex lille illtegral depellds Ilot ollly Oil the elldpoillts
o/the path but ill gelleral also Oil the path itself. The next example gives a first impression
of this, and a systematic discussion follows in the next section.
E X AMP L E 7
Integral of a Nonanalytic Function. Dependence on Path
Integrate f(:)
= Re: = " from 0
to I + 2i (a) along C* in Fig. 340, (b) along C consisting of C1 and C 2 ·
y
2
z=1+2i
I
I
I
I
C'I
z
I
"
C
I
1
"
C
Fig. 340.
x
Paths in Example 7
Solutioll.
(a) C* can be represented by z(t) = r + 2it (0 "'" t"", I). Hence z(t) = I + 2i and f[z(t)] = xCt) = t
on C*. We now calculate
f
Re z d::: =
c*
f
1
t(l + 2i) dt
= ~(I
+ 2i)
=!
+ i.
0
(b) We now have
C1 : z(t) = t.
itt)
C2 : Zll) = I + it,
~(t) = i,
=
I,
f(::(t))
= x(t) =
t
f(zv» = X(I) = 1
(0 "'" t "'" 1)
(0"", t"", 2).
CHAP. 14
644
Complex Integration
Using (6) we calculate
f
Re.: dz =
C
f
Re:: dz +
C1
f
Re:: dz =
C2
f
1
2
t dt + f l . i dt
0
=
~ + 2i.
0
•
Note that this Tesult diffeTs from the result in (a).
Bounds for Integrals. ML -Inequality
There will be a frequent need for estimating the absolute value of complex line integrals.
The basic formula is
II/(z) dzl ~
(13)
ML
(ML-inequality);
L is the length of C and M a constant such that If(z) 1 ~ M everywhere on C.
PROOF
Taking the absolute value in (2) and applying the generalized inequality (6*) in Sec. 13.2.
we obtain
Now ILlZml is the length of the chord whose endpoints are Z",,-l and Zm (see Fig. 337 on
p. 638). Hence the sum on the right represents the length L * of the broken line of chords
whose endpoints are zo, Zl, •.• , Zn (= Z). If n approaches infinity in such a way that the
greatest ILltml and thus ILlZ'>n1 approach zero, then L * approaches the length L of the curve
C, by the definition of the length of a curve. From this the inequality (13) follows.
•
We cannot see from (13) how close to the bound ML the actual absolute value of the
integral is, but this will be no handicap in applying (13). For the time being we explain
the practical use of (13) by a simple example.
EXAMPLE 8
'~
Estimation of an Integral
Find an UppeT bound fOT the absolute value of the integral
C the straight-line segment from 0 to I + i. Fig. 341.
Solution.
L =
V2 and If(z) 1= Iz21 ~ 2 on C gives by (13)
Fig.341. Path in
Example 8
IIc I~ 2V2
Z2
I
dz
I
The absolute value of the imegral is - -2 + -2 i = -2
333
=
2.8284.
Vz =
0.9428 (see Example I).
•
Summary on Integration. Line integrals of .Hz) can always be evaluated by (10), using
a representation (I) of the path of integration. If f(z) is analytic, indefinite integration by
(9) as in calculus will be simpler.
SEC. 14.1
645
Line Integral in the Complex plane
. .....
.- .
.
---- -- .......
PARAMETRIC REPRESENTATIONS
11-91
26.
Find and sketch the path and its orientation given by:
1. zU) = (l
+
3i)t (1
~
2. :.:(1) = 5 - 2it (- 3
3. zU) = 4
4. z(t)
5.
= }
~
4)
7. :.:(1)
+ 4i
+ 5e
(7T ~ 1 ~
=
1)
+i
+
ib to c
x 2 to
f
sec 2
:.:
d:.:, C any path from 7T/4 to 7Ti14
Tm Z2 dz counterclockwise around the triangle with
c
vertices:.: = 0, I, i
:.:e z2/2 d:.:, C from i along the axes to I
Z2,
where Cis
31. (Path partitioning) Verify (6) for f(:.:) = 11:::: and C 1
and C2 the upper and lower halfs of the unit circle
to 4 - 2i
11. Unit circle (c1ocl\.wise)
12. Segment from a
l' =
i
30. (Sense reversal) Verify (5) for fez) =
the segment from -1 - i to I + i.
Sketch and represent parametrically:
10. Segment from I
i along the parabola
c
PARAMETRIC REPRESENTATIONS
110-181
29.
27T)
+ 2t + 8il 2 (-1 ~ 1 ~
t + !it 3 (-1 ~ 1 ~ 2)
+
c
~ T ~ 7T)
:.:(t) = I
9. :.:(t)
28.
f
f
27.
e-.,.it (0 ~ 1 ~ 2)
6 cos 21 + 5i sin 2t (0
=
+
I
1 ~ 3)
3e it (0 ~ t ~ 27T)
it
dz, C from -}
C
z(t) = e it (0 ~ t ~ 7T)
6. :.:(1) = 3
8.
+i +
+i+
t
~
fz
32. (ML-inequality) Find an upper bound of the absolute
value of the integral in Prob. 19.
+ id
+ i to 4 + ~i
13. Hyperbola xy = 1 from I
14. Semi-ellipse x 21a 2 + y21b 2 = 1. Y ~ 0
33. (Linearity) Illustrate (4) with an example of your own.
Prove (4).
15. Parabola y = 4 - 4x 2 (-I ~ x ~ 1)
34. TEAM PROJECT. Integration. (a) Comparison.
Write a short repOit comparing the essential points of
the two integration methods.
+ 3il
16. I:.: - 2
17.
Iz + a + ibl
=
4 (counterclockwise)
= r
(clockwise)
18. Ellipse 4(x - 1)2 + 9(y + 2)2 = 36
(b) Comparison. Evaluate
Integrdte by the first method or state why it does not apply
and then use the second method. (Show the details of your
work.)
19.
f
f
f
f
fc
Re:.: d:.:, C the shortest path from 0 to I
+
i
c
20.
Re z dz, C the parabola y = x 2 from 0 to I
+
i
c
21.
e 2z d:.:, C the shonest path from
7Ti
to 27Ti
c
22.
sin z dz, C any path from 0 to 2i
c
23.
cos 2
:.:
d:.: from -7Ti along
Izl
=
7T
to
7Ti
in the right
half-plane
24.
f
f
c
25.
c
(z
+
l
and check the result by Theorem 2, where:
INTEGRATION
119-291
I/(:::) d::: by Theorem
[1) dz, C the unit circle (counterclockwise)
cosh 4:.: d:.:, C any path from - 7Ti/8 to 71i/8
= ::4 and C is the semicircle Izi
- 2i to 2i in the right half-plane,
(i) f(::.)
= 2 from
(ii) f(:.:) = e2z and C is the shonest path from 0
to 1 + 2i.
(e) Continuous deformation of path. Experiment
with a family of paths with common endpoints, say,
z(t) = 1 + ia sin t, 0 ~ t ~ 71. with real parameter a.
Integrate nonanalytic functions (Re:c, Re (:.:2), etc.) and
explore how the result depends on a. Then take analytic
functions of your choice. (Show the details of your
work.) Compare and comment.
(d) Continuuus deformation of path. Choose
another
family.
for
example.
semi-ellipses
z(t) = a cos 1 + i sin I, -7T/2 ~ t ~ 71'/2, and
experiment as in (c).
35. CAS PROJECT. Integration. Write programs for the
two integration methods. Apply them to problems of
your choice. Could you make them into a joint program
that also decides which of the two methods to use in a
given case?
646
14.2
CHAP. 14
Complex Integration
Cauchy's Integral Theorem
We have just seen in Sec. 14.1 that a line integral of a function fez) generally depends
not merely on the endpoints of the path, but also on the choice of the path itself. This
dependence often complicates situations. Hence conditions under which this does not
occur are of considerable importance. Namely. if .Hz) is analytic in a domain D and D is
simply connected (see Sec. 14.1 and also below), then the integral will not depend on the
choice of a path between given points. This result (Theorem 2) follows from Cauchy's
integral theorem. along with other basic consequences that make Cauchy's integral
theorem the most importallt theorem in this chapter and fundamental throughout complex
analysis.
Let us begin by repeating and illustrating the definition of simple connectedness
(Sec. 14.1) and adding some more details.
1. A simple closed path is a closed path (Sec. 14.1) thaJ does not intersect or touch
itself (Fig. 342). For example, a circle is simple, but a curve shaped like an 8 is not
simple.
(
\
Simple
Not simple
Not simple
Simple
Fig. 342.
Closed paths
2. A simply connected domain D in the complex plane is a domain (Sec. 13.3) such
that every simple closed path in D encloses only points of D. Examples: The interior
of a circle ("open disk"). ellipse. or any simple closed curve. A domain that is not
simply connected is called mUltiply connected. Examples: An annulus (Sec. 13.3),
a disk without the center, for example, 0 < Izl < 1. See also Fig. 343 .
.. -, ...
,
,,
'"
," .../,'
, '" ,,
, '-'
""", " ., : ,'--', \
-, "
.,_---"!!Ir
I
,
,
",
I
I
I
\
\
,-,
"
' ......... _."
,
.
,'
" ....... _-' "
~
I.
\
,.,-.".-- ..... ,
,
,
1
,
I
I
,""
'-',
"
'\
','"
" ' .... _--'--,,'
.... '
'\
Simply
connected
Simply
connected
Fig. 343.
'\
\
I
"
\
I
"..,
' ... _..... '
....
I
I
_----
"
I
I
",
I
"
•
,,
, ... ---"."
,"'-,
I
,
,
.
'\
I
,,
, ' .. --' ,,--, ,I
I
I ,
\
'..
Doubly
connected
,
,
"
I "
,-_/,
I
' ..... _----'"
Triply
connected
Simply and multiply connected domains
More precisely, a bounded domain D (thal is, a domain that lies entirely in some circle about the origin) is
called p-fold connected if its boundary consists of p closed connected sets without common points. These sets
can be curves, segments, or single points (such as z = 0 for 0 < Izl < I. for which p = 2). Thus, D has p - I
"holes", where "hole" may also mean a segment or even a single point. Hence an annulus is doubly connected
(p = 2).
SEC 14.2
Cauchy's Integral Theorem
THEOREM 1
647
Cauchy's Integral Theorem
Iff(z) is analytic in a simply connected domain D. tllenfor every simple closed path
C in D.
f
(1)
c
fez) dz = O.
See Fig. 344.
--- -----
"/'-0 "
...
...
\
I
,,-'
-' -'
,,-'
I
/
/
:
D
"
.....
C
_---- ------------'
Fig. 344.
,,/
Cauchy's integral theorem
Before we prove the theorem. let us consider some examples in order to really understand
what is going on. A simple closed path is sometimes called a colltour and an integral over
such a path a contour integral. Thus, (1) and our examples involve contour integrals.
E X AMP L E 1
No Singularities (Entire Functions)
fez
dz
c
=
O.
f
(n =
cos zdz = 0,
for any closed path, since these functions are entire (analytic for all
E X AMP L E 2
0, 1.... )
c
•
~).
Singularities Outside the Contour
f
sec
c
~d;: =
J,
O.
Jc
d::.
<.2
+4
= 0
where C is the unit circle, sec z = IIcos <. is not analytic at ;;: = ± 7rf2, ±37Tf2 • ... , but all these points lie
outside C; none lies on C or inside C. Similarly for the second integral, whose integrand is not analytic at
z = ::':2i outside C.
•
E X AMP L E 3
Nonanalytic Function
f:: = I
2,,-
e-itiit dt
d;:.
C
where C: ~(t)
analytic.
E X AMP L E 4
=
= 27ri
0
e it is the unit circle. This does not contradict Cauchy's theorem because f(z)
= :: is
not
•
Analyticity Sufficient, Not Necessary
J,
r
dz
7
C~
=
U
2
where C is the unit circle. This result does not follow from Cauchy'~ theorem. because f(::;) = 1/;:.2 is not analytic
at z = O. Hence the condition that f be analytic ill D is sujficiellf rather thall neces.mr\' for ( I) to be true. •
648
CHAP. 14
E X AMP L E 5
Complex Integration
Simple Connectedness Essential
d:
rJ,c --== 27ri
-
for counterclockwise imegrarion around rhe unit circle (see Sec. 14.1). C lies in the annulus ~ < 1:1 < ~ where
If: is analytic. but this domain is not simply connected. so that Cauchy"s theorem cannot be applied. Hence the
condition that tile doma;'1 D be simply connected is essential.
In other word,. by Cauchy's theorem. if II:) is analytic on a simple closed path C and everywhere inside C,
with no exception. not even a single point. then (I) holds. The point that causes trouble here is : = 0 where If:
is not analytic
•
PROOF
Cauchy proved his integral theorem under the additional assumption that the derivative
f' (z) is continuous (which is true. but would need an extra proof). His proof proceeds as
follows. From (8) in Sec. 14.1 we have
fc
f(z) dz
=
fc
(u dx - v dy)
+
i
fc
(u dy
+
v dx).
Since .f(z) is analytic in D, its derivative .f' (z) exists in D. Since .f' (z) is assumed to be
continuous, (4) and (5) in Sec. 13.4 imply that u and v have continuous partial derivatives
in D. Hence Green's theorem (Sec. 10.4) (with u and -v instead ofF! and F 2 ) is applicable
and gives
f
(u dx - v dy) =
C
I I (- ~ax - ~)
a)
dr dy
R
where R is the region bounded by C. The second Cauchy-Riemann equation (Sec. 13.4)
shows that the integrand on the right is identically zero. Hence the integral on the left is
zero. In the same fashion it follows by the use of the first Cauchy-Riemann equation that
•
the last integral in the above formula is zero. This completes Cauchy's proof.
Goursat's proof without the cOllditioll that f' (z) is cOlltillllOUS 1 is much more
complicated. We leave it optional and include it in App. 4.
Independence of Path
We know from the preceding section that the value of a line integral of a given function
Z1 to a point Z2 will in general depend on the path C over which we
integrate, not merely on Z1 and Z2' It is imp0l1ant to characterize situations in which this
difficulty of path dependence does not occur. This task suggests the following concept.
We call an integral of .f(z) independent of path in a domain D if for every ::::1, Z2 in D
its value depends (besides on f(::::), of course) only on the initial point ::::1 and the terminal
point Z2, but not on the choice of the path C in D [so that every path in D from Z1 to ::::2
gives the same value of the integral of f(z)].
.f(z) from a point
ItDOUARD GOURSAT (1858-1936). French mathematician. Cauchy published rhe theorem in 1825. The
removal of that condition by GourSal (see Transactions Amer. Math. Soc.. vol. I. 1900) is quite important. for
instance, in connection with the fact thai derivatives of analytic functions are also analytic. as we shall prove
soon. Goursat also made important contributions to PDEs.
SEC. 14.2
649
Cauchy's Integral Theorem
THEOREM 2
Independence of Path
If fez) is analytic in a simply connected domain D, then the integral of .f(z) is
independent of path in D.
PROOF
Let ZI and Z2 be any points in D. Consider two paths C 1 and C 2 in D from Zl to Z2 without
further common points, as in Fig. 345. Denote by ci the path C2 with the orientation
reserved (Fig. 346). Integrate from :::1 over C 1 to ;:2 and over C~ back to Zl' This is a
simple closed path, and Cauchy's theorem applies under our assumptions of the present
theorem and gives zero:
(2')
I f dz + I
el
c~
f
dz
I f dz
thus
= 0,
C,
=
-
I
c,;
f
dz.
But the minus sign on the right disappears if we integrate in the reverse direction, from
to Z2, which shows that the integrals of fez) over C 1 and C2 are equal.
ZI
Ic,
(2)
fC:;) dz
=
Ic,
(Fig. 345).
.f(z) dz
This proves the theorem for paths that have only the endpoints in common. For paths that
have finitely many further common points, apply the present argument to each "loop"
(portions of C1 and C2 between consecutive common points; four loops in Fig. 347). For
paths with infinitely many commOn points we would need additional argumentation not
•
to be presented here.
Fig. 345.
Formula (2)
Fig. 346.
Formula (2')
Paths with more
common points
Fig. 347.
Principle of Deformation of Path
This idea is related to path independence. We may imagine that the path C2 in (2) was
obtained from C 1 by continuously moving C 1 (with ends fixed!) until it coincides with
C2 . Figure 348 shows two of the infinitely many intermediate paths for which the integral
always retains its value (because of Theorem 2). Hence we may impose a continuous
deformation of the path of an integral, keeping the ends fixed. As long as our deforming
path always contains only points at which .f(z) is analytic, the integral retains the same
value. This is called the principle of deformation of path.
650
CHAP. 14
Complex Integration
c)
------, ...." ,
...
---~....
",
,
\
\
\
...
\
\
\
I
\
I
. . . . . . . \1
2)
Fig. 348.
E X AMP L E 6
Continuous deformation of path
A Basic Result: Integral of Integer Powers
From Example 6 in Sec. 14.1 and the principle of deformation of path it follows that
f
(3)
zor d"
(z -
=
2m
{ 0
(m = -I)
(m
'* - I and integer)
for counterclockwise integration around allY simple closed path cOlltaillillg Zo ill its illterior.
Indeed. the circle Iz - .::01 = P in Example 6 of Sec. 14.1 can be continuously defonned in two steps into a path
as just indicated. namely. by first defomling. say. one semicircle and then the other one. (Make a sketch).
•
Existence of Indefinite Integral
We shall now justify our indefinite integration method in the preceding section [formula
(9) in Sec. 14.1]. The proof will need Cauchy's integral theorem.
Existence of Indefinite Integral
THEOREM 3
If f(::.)
is analytic in a simply c01lnected domain D, tben there exists an indefinite
integral F(z) of f(z) in D-thus, F'(z) = f(z)-which is analytic in D, and for all
paths in D joining any two points ::'0 alld ::'1 in D, the integral of f(z) from ::'0 to ZI
call be evaluated by fOl1llula (9) in Sec. 14.1.
PROOF
The conditions of Cauchy's integral theorem are satisfied. Hence the line integral of f(z)
from any Zu in D to any z in D is independent of path in D. We keep Zo fixed. Then this
integral becomes a function of z. call if F(z},
(4)
=
F(z)
r
f(z*) dz*
Zo
which is uniquely detennined. We show that this F(z) is analytic in D and F' (z) = .f(z).
The idea of doing this is as follows. Using (4) we form the difference quotient
F(z
(5)
+
.lz) - F(z)
ll.'-.
=
1
A
LlZ
[Z+l1Z
I
Zo
f(z*) dz* -
IZ
f(::.*) d::.*
Zo
]
=
1
A _
Ll-<,
f
z+!lz
f(z*) dz*.
Z
We now ~ubtract f(z) from (5) and show that the resulting expression approaches zero as
ll.z ~ O. The details are as follows.
SEC. 14.2
651
Cauchy's Integral Theorem
We keep z fixed. Then we choose z + fl.z in D so that the whole segment with
endpoints z and z + fl.z is in D (Fig. 349). This can be done because D is a domain,
hence it contains a neighborhood of z. We use this segment as the path of integration
in (5). Now we subtract fez). This is a constant because z is kept fixed. Hence we can
write
J
J
Z+.'1z
1
z+.'1z
fez) dz*
= fez)
z
= fez) fl.z.
dz*
Thus
fez)
=
z
A
tiZ
J
z+.'1z
fez) dz*.
z
By this trick and from (5) we get a single integral:
F(z
+
fl.z) - F(z)
fl.
-
z
1 f
tiZ
Since .f(z) is analytic, it is continuous. An
such that I.f(z*) - f(z) 1< E when Iz* - zl
ML-inequality (Sec. 14.1) yields
I
F(z
+
I
fl.z) - F(z)
fl.z
- fez)
1
=
Ifl.zl
Z
= A
f(::.)
If
+.'1Z
.
[f(z*) - fez)] dz"'.
Z
> 0 being given, we can thus find a 8 > 0
< 8. Hence. letting Ifl.zl < 8. we see that the
E
Z
+.'1Z
z
[.f(z*) - fez)] d-;;*
I
:S
By the definition of limit and derivative, this proves that
,
.
F(z
F (z) = hm
+
.'1z->O
fl.z) - F(z)
=
fl.::.
.f{z) .
Since Z is any point in D, this implies that F(z) is analytic in D and is an indefinite integral
or antiderivative of f(z) in D, written
F(z)
=
ff(z) dz.
Also, if c' (z) = fez), then F' (z) - c' (z) ~ 0 in D; hence F(z) - C(z) is constant in D
(see Team Project 26 in Problem Set 13.4). That is, two indefinite integrals of fez) can
differ only by a constant. The latter drops out in (9) of Sec. 14.1, so that we can use any
•
indefinite integral of fez). This proves Theorem 3.
-------- ,
.....
",,,,,,,,,,,,,,,,,,,
/
/
,
"
'"
/
\
\
D
",/
-'"
I
'"
Zo
",,,,,,,,,,,,,,,,
\,----'"
Fig. 349.
,
I
I
"
I
z + t.z
Path of integration
652
CHAP. 14
Complex Integration
Cauchy's Integral Theorem for
Multiply Connected Domains
Cauchy's theorem applies to multiply connected domains. We first explain this for a
doubly connected domain D with outer boundary curve C1 and inner C2 (Fig. 350). If
a function fez) is analytic in any domain D* that contains D and its boundary curves, we
claim that
f
(6)
fez) d::. =
C1
f
(Fig. 350)
fez) dz
C2
both integrals being taken counterclockwise (or both clockwise, and regardless of whether
or not the full interior of C2 belongs to D*).
Fig. 350.
PROOF
By two cuts
Paths in (5)
C\ and C2
(Fig. 351) we cut D into two simply connected domains Dl and
.Hz) is analytic. By Cauchy's integral theorem the
integral over the entire boundary of Dl (taken in the sense of the arrows in Fig. 351) is
zero, and so is the integral over the boundary of D 2 , and thus their sum. [n this sum the
integrals over the cuts C1 and C2 cancel because we integrate over them in both
directions-this is the key-and we are left with the integrals over C1 (counterclockwise)
and C2 (clockwise; see Fig. 351); hence by reversing the integration over C2 (to
counterclockwise) we have
D2 in which and on whose boundaries
f
C1
f d z - f fd::;=O
C2
•
and (6) follows.
For domains of higher connectivity the idea remains the same. Thus, for a triply connected
domain we use three cuts Cb C2, C3 (Fig. 352). Adding integrals as before, the integrals
over the cuts cancel and the sum of the integrals over C1 (counterclockwise) and C2 , C3
(clockwise) is zero. Hence the integral over C1 equals the sum of the integrals over C2
and C3 , all three now taken counterclockwise. Similarly for quadruply connected domains,
and so on.
-
~
Ldr~~c
Cl'~~J.2
D2
Fig. 351.
C
CI
j
Doubly connected domain
Fig. 352.
Triply connected domain
SEC. 14.2
Cauchy's Integral Theorem
653
CAUCHY'S INTEGRAL THEOREM
APPLICABLE?
2::
f(::) =
(i)
_2
+ 3i
+1
-
Integrate f(::) counterclockwise around the unit circle.
indicating whether Cauchy's integral theorem applies.
(Show the details of your work.)
1. f(::) = Re::
3. f(::)
= ez
6. f(::) = sec (::/2)
7. f(::)
=
11(::8 - 1.2)
9. f(::)
=
1/(21<:13)
11 •.f(z)
=
.:2 cot .:
112-171
= II:
4. f(::)
/2
5. f(::) = tan:: 2
8. f(::) = 1/(4z - 3)
10. f(::)
l
=
119-301
COMMENTS ON TEXT AND EXAMPLES
12. (Singularities) Can we conclude in Example 2 that
the integral of 11(::2 + 4) taken over (a) Iz - 21 = 2,
(b) I:: - 21 = 3 is zero? Give reasons.
13. (Cauchy's integral theorem) Velify Theorem 1 for
the integral of ::2 over the boundary of the square
with vertices I + i, -I + i. -I - i, and I - i
(counterclockwise).
f(::) =
4
z+I
-=---:;:2 + 2::
(c) Deformation of path. Review (c) and (d) of Team
Project 34, Sec. 14.\. in the light of the principle of
deformation of path. Then consider another family of
paths with common endpoints. say, ::(t) = r + ia(r - (2).
o~ (~ 1. and experiment with the integration of analytic
and nonanalytic functions of your choice over these paths
(e.g., ::. 1m::. ::2, Re .:2, 1m Z2, etc).
2. f(::) = 11(3:: - 1Ii)
2
(ii)
FURTHER CONTOUR INTEGRALS
Evaluate (showing the details and using partial fractions if
necessary)
19. ,.(
'f
2-d::_ i . C the circle
c -
20.
21.
f
f
c
tanh::: d::, C the circle
IIz
=
Iz -
3 (counterclockwise)
!7Til =
~ (clockwise)
Re 2::: d:;:, C as shown
c
14. (Cauchy's integral theorem) For what contours C will
it follow from Theorem I that
(a)
f
,.(
d:: = 0,
c ::
(c)
(b)
'f
cos :::
_6 _
c-
fc::
_2
c
d:: = O.
-
-2--
+9
1 x
-1
elfz
d::
=
O?
15. (Deformation principle) Can we conclude from
Example 4 that the integral is also zero over the contour
in Problem 13?
22.
f
7z - 6
_2 _
c~
2- d::, C as shown
-
_--_c
16. (Deformation principle) If the integral of a function
fez) over the unit circle equals 3 and over the circle
Izl = 2 equals 5, can we conclude that fez) is analytic
everywhere in the annulus I < Izl < 2?
17. (Path independence) Verify Theorem 2 for [he
integral of cos:: from 0 to (l + i}7T(a) overthe shortest
path. (b) over the x-axis to 7T and then straight up to
(l + i)7T.
x
23.,.(
2 d::
Jcz -
I
, C as shown
y
18. TEAM PROJECT. Cauchy's Integral Theorem.
(a) Main Aspects. Each of the problems in Examples
1-5 explains a basic fact in connection with Cauchy's
theorem. Find five examples of your own, more
complicated ones if possible. each illustrating one of
those facts.
(b) Partial fractions. Write f(::) in terms of partial
fractions and integrate it counterclockwise over the unit
circle, where
x
24.
,.( e
2z
'f -4- d::. C consists of 1:::/
c
(counterclockwise)
o.
=
2 (clockwise) and /<:/ =
f
CHAP. 14
654
25.
cos
rJ:c ~
dz, C consists of Izl =
(.
7
and
26.
Complex Integration
Izi
f Ln
+ ::) d::, C the boundary of the square with
c
vertices
27.
J:
2
d::
Jcz +
•
C: (a)
Izi
!.
(b)
Iz - il
3
2:
(counterclockwise)
14.3
-2-- ,
29.
f
30.
f
:!: 1, :!: i
1
rJ:c:: d::+ 1
C:
(a) Iz + il
I,
(b)
Iz - il
(counterclockwise)
3 (clockwise)
=
(2
28.
1 (counterclockwise)
-sin::
- . d:;., C:
c:: + 21
tan (::/2)
4
16
veltices :!: 1,
cZ
-
Iz -
d::., C
~i
4 - 2i I = 5.5 (clockwise)
the boundary of the square with
(clockwise)
Cauchy's Integral Formula
The most important consequence of Cauchy's integral theorem is Cauchy's integral
formula. This formula is useful for evaluating integrals, as we show below. Even more
important is its key role in proving the surprising fact that analytic functions have
derivatives of all orders (Sec. 14.4), in esrablishing Taylor series representations
(Sec. 15.4), and so on. Cauchy's integral formula and irs conditions of validity may be
stated as follows.
Cauchy's Integral Formula
THEOREM 1
Let fez) be analytic il1 a simply connected domain D. Then for allY POi11T
alld any simple closed path C in D that encloses Zo (Fig. 353),
(1)
f(::)
rJ:C<·-_-_dz =
27Tif(zo)
':0
ill D
(Cauchy's integral formula)
---0
the integration being taken cuullterc!ockwise. Alternatively (for representing f(zo)
by a contour integral, divide (I) by 27Ti),
(1*)
PROOF
f(zo)
1
27Tl
= -.
fez)
rJ:C --.
dz
Z - Zo
(Cauchy's integral formula).
By addition and subtraction, fez) = f(zo) + [fez) - fC2{)]. Inserting this into (l) on the
left and taking the constant factor f(.::o) out from under the integral sign, we have
(2)
The first term on the right equals f(;:.o)· 27Ti (see Example 6 in Sec. 14.2 with 111 = - I).
This proves the theorem. provided the second integral on the right is zero. This is what
we are now going to show. Its integrand is analytic, except at Zoo Hence by (6) in
Sec. 14.2 we can replace Cby a small circle K of radius p and center.::o (Fig. 354), without
SEC. 14.3
Cauchy's Integral Formula
655
c
Fig. 353.
Fig. 354.
Cauchy's integral formula
o
K
Proof of Cauchy's integral formula
altering the value of the integral. Since f(~) is analytic, it is continuous (Team Project 26,
Sec. 13.3). Hence an E > 0 being given, we can find aD> 0 such that 1ft.:) - f(~o)1 < E
for all z in the disk Iz - 201 < o. Choosing the radius p of K smaller than 0, we thus have
the inequality
~
fez) - f(zo) 1 <
P
Z - 20
1
at each point of K. The length of K is 27fp. Hence, by the ML-inequality in Sec. 14.1,
l f(z~
1...
I
K
_- f_(2 0 )
d-:I-
< -E 27fp = 27fE.
P
"-.(.0
Since E (> 0) can be chosen arbitrarily small, it follows that the last integral in (2) must
have the value zero, and the theorem is proved.
•
E X AMP L E 1
Cauchy's Integral Formula
lJ ~_~ 2
c
d::.
= 2'ITie Z
I
z~2
=
2'ITie
2
= 46.4268;
for any contour enclosing ::'0 = 2 (since eZ is entire). and zero for any contour for which ::'0 = 2 lies outside (by
Cauchy's integral theorem).
•
E X AMP L E 2
Cauchy's Integral Formula
f
C
Z3 - 6
-2--. dz =
Z -
f
~Z3
-
3
--1-·
dz
C Z - 2'
I
=
2'ITi[~::.3 - 3]1
z~i/2
'IT
=
E X AMP L E 3
"8 -
6'ITi
Integration Around Different Contours
Integrate
Z2
+1
g(z) = - - Z2 -
1
::.2
+ I
----(::. + 1)(z - 1)
counterclockwise around each of the four circles in Fig. 355.
(::'0
= li inside
C) .
•
656
CHAP. 14
Complex Integration
Solution.
g(::) is not anal)1ic at -I and L These are the points we have to watch for. We consider each
circle separately.
(a) The circle
write
I:: -
11 = I encloses the point ::0 = 1 where g(;:J is not analytic. Hence in (1) we have to
;:2 + I
g(::) = - Z2 -
I
<:
+
z- 1 '
1
thus
fez) =
_2 + 1
~+I
and (I) gives
+ I
fzZ2
c
-2--
dz = 27Tif(l) = 27Ti
I
[
Z2
+ 1]
---
z+
I
= 27Ti.
z-l
lb) gives the same as (a) by the principle of deformation of path.
(c) The function glz) is as before, but fez) changes because we must take Zo = -I (instead of 1). This gives
a factor z - ~o = z + 1 in (1). Hence we must write
z-
1
z+
I '
thus
Compare this for a minute with the previous expression and then go on:
f
c
2+1
--2--
d;: = '27Tif(- I) = 27Ti
c:: - I
[-2+IJ
~
I
z~-l
= -hi
•
(d) gives O. Why?
y
x
Example 3
Multiply connected domains may be handled as in Sec. 14.2. For instance, if fez) is
analytic on C 1 and C2 and in the ring-shaped domain bounded by C 1 and C2 (Fig. 356)
and ~o is any point in that domain, then
(3)
1
-)=f( -0
2·
m
f
I-f(~)
-d- +
2·
m
~z-~
f
-fez)
- d --..
~z-~
where the outer integral (over C 1) is taken counterclockwise and the inner clockwise, as
indicated in Fig. 356.
SEC. 14.3
Cauchy's Integral Formula
657
c]
Fig. 356.
Formula (3)
Our discussion in this section has illustrated the use of Cauchy's integral formula in
integration. In the next section we show that the formula plays the key role in proving
the surprising fact that an analytic function has derivatives of all orders, which are thus
analytic functions themselves.
::a-III',--
11-41
CONTOUR INTEGRATION
Integrate
circle:
(Z2 -
1. Iz - il
+ 4) counterclockwise around the
4)/(Z2
2. Iz - 11
2
=
3. Iz + 3il = 2
l ::!.iI
4. Izl
=
=
2
71"!2
CONTOUR INTEGRATION
Using Cauchy's integral formula (and showing the details),
integrate counterclockwise (or as indicated)
,(
7
,( e- 3 r.z
13. :r - - _ dz. C the boundary of the square with
c2z + I
vertices ±1, ±i
14.
C- Iz - 11
2
=
z-
I
7.
f
sinh
rr;:;
-2--
cZ
3z
-
,(
8. :r
c
Z2 _
dz
,(
9. :r
c
Z2 -
dz,
1 .
C: Iz
,( cosz
11. :r - - dz,
c 2z
,( tanz
12. :r - - d;:;,
z-
i
1
IZ - 1I =
C:
dz
=
17.
f
+ 11
C: Izl
=
71"/2
=
4
~
C the boundary of the triangle with
vertices 0 and ± 1
+ 2i
~.
-
d:.,
C: Iz - 41 = 2
Cconsistsoflzl =3 (counterclockwise)
21Z
f
cosh2 z.
c(z-l-i)z
2
dz,
C as in Prob. 16
f
(z - Z1)-\Z - Z2)-1 dz = 0 for a simple
c
closed path C enclosing Z1 and Z2, which are arbitrary_
= 1
C- Iz - 2il
sm"
2
18. Show that
c-Izl = I
l'
10. ,( ~ dz,
:rc z - 2z
c
C- Izl
+ 1)
2
and Izl = 1 (clockwise)
C ,.
,( e 3z
6. :r -3- . dz,
(z
dz, C consists of Iz - 2il = 2
c z + 1
(counterclockwise) and Iz - 2il = ~ (clockwise)
cZ
5. :r : _ 2 dz.,
C
Ln
,( Ln (z. - 1)
15. :r
d;:;,
c
z-5
16.
+2
f
19. CAS PROJECT. Contour Integration. Experiment
to find out to what extent your CAS can do contour
integration (a) by using the second method in Sec. 14.1,
(b) by Cauchy's integral formula.
20. TEAM PROJECT. Cauchy's Integral Theorem.
Gain additional insight into the proof of Cauchy's
integral theorem by producing (2) with a contour
enclosing ;:;0 (as in Fig. 353) and taking the limit as in
the text. Choose
(b)
,( sin;:;
:r - - 1 - dz,
c z - 271"
and (c) two other examples of your choice.
658
14.4
CHAP. 14
Complex Integration
Derivatives of Analytic Functions
In this section we use Cauchy's integral formula to show the basic fact that complex
analytic functions have derivatives of all orders. This is very surprising because it differs
strikingly from the situation in real calculus. Indeed, if a real function is once
differentiable. nothing follows about the existence of second or higher detivatives. Thus.
in this respect, complex analytic functions behave much more simply than real functions
that are once differentiable.
The existence of those derivatives will result from a general integral formula, as follows.
THEOREM 1
Derivatives of an Analytic Function
If fez) is analytic in a domain D, then it has derivatives of all orders in D. which
are then also analytic functions in D. The values of these derivatives at a point Zo
in D are given by the fOl7llulas
(l ')
f
,
1
(zo)
J.
fez)
= -2
.
7TI
r
(7 _ 7 )2 dz
n'. .
-2
f
fez)
C
~
,·0
(I ")
lind in general
(1)
tn)(Zo)
=
7TI
C (z - zo)n+l
d::.
(n
= 1,2, ... );
here C is any simple closed path i11 D that encloses Zo and whose full interior belongs
to D; and we integrate counterclockwise arollnd C (Fig. 357).
Fig. 357.
Theorem 1 and its proof
COI\IMENT. For memorizing (I). it is useful to observe that these formulas are obtamed
formally by differentiating the Cauchy formula (l *), Sec. 14.3, under the integral sign
with respect to zoo
SEC. 14.4
659
Derivatives of Analytic Functions
PROOF
We prove (1 '), starting from the definition of the delivative
f '(~~'0) -On the right we represent f(zo
f(zo
+
+
I'
f(zo
1m
+
"'2-->0
.6.z) - f(~)
.6.z
.6.z) and f(zo) by Cauchy's integral formula:
.6.z) - f(zo) = _1_
.6.z
27Ti.6.z
[1
Jc z
fez)
dz - (Zo + .6.z)
1~
Jc z
dZ] .
- Zo
We now write the two integrals as a single integral. Taking the common denominator
gives the numerator f(z){z - Zo - [z - (zo + ilz)]} = fez) .6.z, so that a factor ilz drops
out and we get
f(zo
+
.6.z) - f(zo) = _1_
.6.z
27Ti
1
fez)
dz .
- Zo - .6.z)(z - zo)
Jc (z
Clearly, we can now establish (1') by showing that, as .6.z ---'> 0, the integral on the right
approaches the integral in (1 '). To do this, we consider the difference between these two
integrals. We can write this difference as a single integral by taking the common
denominator and simplifying the numerator (as just before). This gives
fc _Zo
(z
!(z)_
_
dz .6..:.)(z
zo)
fc
(z
~z:
2
dz =
':'0)
fc
f(z).6.z
2 dz.
(z - Zo - .6.z)(z - zo)
We show by the ML-inequality (Sec. 14.1) that the integral on the right approaches zero
as .6.z ---'> O.
Being analytic, the function fez) is continuous on C, hence bounded in absolute value,
say, If(z)1 ~ K. Let d be the smallest distance from Zo to the points of C (see Fig. 357).
Then for all z on C,
17- 712>= d 2 ,
.....
.....0
hence
Furthermore, by the triangle inequality for all
d ~ Iz - zol = Iz - Zo - .6.;::
z on C we then also have
+ .6.21
~ Iz - Zo - .6.zl
+ l.6.zl·
We now subtract lilzl on both sides and let l.6.zl ~ dl2, so that -lilzl ~ -d/2. Then
id ~ d -
l.6.zl ~ Iz - Zo - .6.zl·
Hence
2
-:---------,- ::; -
Iz - Zo - .6.zl
Let L be the length of C. If l.6.zl ~ dl2, then by the ML-inequality
d
660
CHAP. 14
Complex Integration
This approaches zero as !:::..Z ---,> O. Formula (1 ') is proved.
Note that we used Cauchy's integral formula (1 *), Sec. 14.3, but if all we had known
about f(zo) is the fact that it can be represented by (1 *), Sec. 14.3, our argument would
have established the existence of the derivative t' (zo) of fez). This is essential to the
continuation and completion of this proof, because it implies that (1") can be proved by
a similar argument, with f replaced by f', and that the general formula (1) follows by
induction.
•
E X AMP L E 1
Evaluation of Line Integrals
From (1 '), for any contour enclosing the point 71"i (counterclockwise)
f
_c_o_s2-.-:0 dz = 271"i(cos
c (z - m) 2
E X AMP L E 2
•
= -271"i sin 71"i = 271" sinh 71"
z~.".i
From (1 "), for any contour enclosing the point - i we obtain by counterclockwise integration
f.
rc
E X AMP L E 3
Z)'I
2
Z4 -
(Z
3z
+
+6
·3
I)
dz
=
2"1
4
71"i(z - 3z + 6)
.
=
z~-,
71"i[12z
2-
61z~-i
= -1871"i.
•
By (1'), for any contour for which 1 lies inside and ±2i lie out~ide (counterclockwise),
f
:z
c (z - 1) (z
2
+ 4)
dz =
271"i(-/-)'
I
+
z
4
z
. e (z2
= 271"1
+ 4)
2
(z
+
z~l
Z
- e 2z
2
I
z~l
4)
6e71" .
.
= - - 1 = 2.0501.
25
•
Cauchy's Inequality. Liouville's and Morera's Theorems
As a new aspect, let us now show that Cauchy's integral theorem is also fundamental in
deliving general results on analytic functions.
Cauchy's Inequality. Theorem I yields a basic inequality that has many applications.
To get it, all we have to do is to choose for C in (1) a circle of radius r and center Zo and
apply the ML-inequality (Sec. 14.1); with If(z)1 ~ M on C we obtain from (I)
7 1 -f (n) Co)
1
If.
~
2
r
7T
(7 _
c~
fez)
Zo
71
)n+l d_
:5=
This gives Cauchy'S inequality
(2)
To gain a first impression of the importance of this inequality, let us prove a famous
theorem on entire functions (definition in Sec. 13.5). (For Liouville, see Sec. 5.7.)
SEC. 14.4
661
Derivatives of Analytic Functions
THEOREM 2
Liouville's Theorem
If an entire function is bounded in absolute value in the whole complex plane, then
this function must be a constant.
PROOF
By assumption, If(z)1 is bounded, say, If(z)1 < K for all z. Using (2), we see that
If' (zo)1 < Klr. Since fez) is entire, this holds for every r, so that we can take r as large
as we please and conclude that f' (zo) = O. Since Zo is arbitrary, f' (z) = Ux + ivx = 0
for all z (see (4) in Sec. 13.4), hence U x = Vx = 0, and uy = Vy = 0 by the Cauchy-Riemann
equations. Thus u = const, v = const, and f = u + iv = const for all z. This completes
.
~~
Another very interesting consequence of Theorem 1 is
THEOREM 3
Morera's2 Theorem (Converse of Cauchy's Integral Theorem)
If fez) is continuous in a simply connected domain D and if
f
(3)
fez) dz = 0
c
for every closed path in D, then fez) is analytic in D.
PROOF
In Sec. 14.2 we showed that if fez) is analytic in a simply connected domain D. then
F(z) =
r
f(z*) dz*
Zo
is analytic in D and F' (z) = fez). In the proof we used only the continuity of fez) and the
property that its integral around every closed path in D is zero; from these assumptions
we concluded that F(z) is analytic. By Theorem 1, the derivative of F(z) is analytic, that
•
is, fez) is analytic in D, and Morera's theorem is proved.
11-81
CONTOUR INTEGRATION
Imegrate counterclockwise around the circle Izl = 2. (n is
a positive integer, a is arbitrary.) Show the details of your
work.
1.
cosh 3z
2.
sin z
(z -
7fil2)
3.
5.
e Z cos
(z -
Z
7f12)2
sinh az
Z4
cos z
4. ?n+l
6.
Zn
4
7.
(z - a)n+l
8.
Ln (z
+ 3) + cos z
+ 1)2
(z
eZ
(z -
a)n
2GlACINTO MORERA (1856-190Y), Italian mathematician who worked in Genoa and Turin.
662
CHAP. 14
[9-131
Complex Integration
INTEGRATION AROUND DIFFERENT
CONTOURS
14. TEAM PROJECT. Theory on Growth
(a) Growth of entire functions. If fez) is not a
constant and is analytic for all (finite) z, and Rand M
are any positive real numbers (no matter how large),
show that there exist values of z for which Izl > Rand
If(z)1 > M.
(b) Growth of polynomials. If fez) is a polynomial
of degree n > 0 and M is an arbitrary positive real
number (no matter how large), show that there exists
a positive real number R such that If(z)1 > M for all
Izl >R.
(c) Exponential function. Show that fez) = e Z has
the property characterized in (a) but does not have that
characterized in (b).
Integrate around C. Show the details.
9.
10.
+ 2:::) cosz
(I
2'
(2:: - 1)
sin 4z
3
(z - 4 )
and
'
I:: - 31
tan 7rZ
::
C the unit circle. counterclockwise
C consists of 1.:1
= ~
(clockwise)
C the ellipse 16 x 2
11.
--2-'
12.
e 2z
, C consists of Iz
z(:: - 2i)2
and
Izl
=
+ .v2
=
- i I=
L counterclockwise
3 (counterclockwise)
(d) Fundamental theorem of algebra. rf fez) is a
po/ynDmial in z, IUlt a constant, then fez) = 0 for at
least one value (If z. Prove this, using (a).
C the circle I:: - 2 - i I = 3, counterclockwise
15. (Proof of Theorem 1) Complete the proof of Theorem
1 by performing the induction mentioned at the end.
1 (clockwise)
ez / 2
13. (__, _ a)
5 (counterclockwise)
4'
o .- S T ION SAN D PRO B L EMS
J
J
1. What is a path of integration? What did we assume
about paths?
13. Is Re
2. State the definition of a complex line integral from
memory.
14. How did we use integral formulas for derivatives in
integration?
3. What do we mean by saying that complex integration
is a linear operation?
15. What is Liouville's theorem? Give examples. State
consequences.
4. Make a list of integration methods discussed. lllustrate
each with a simple example.
116-301
5. Which integration methods apply to analytic functions
only?
16. 4z 3
6. What value do you get if you integrate liz
counterclockwise around the unit circle? (You should
memorize this basic result.) If you integrate liz 2,
1/z3, . . . ?
IS. :: + liz counterclockwise around I:: -I 3il = 2
19. e 2z from -2 + 37ri along the straight segment to
7. Which theorem in this chapter do you regard as most
important? State it from memory.
S. What is independence of path? What is the principle of
deformation of path? Why is this important?
9. Do not confuse Cauchy's integral theorem and Cauchy's
integral formula. State both. How are they related?
10. How can you extend Cauchy's integral theorem to
doubly and triply connected domains?
11. If integrating fez) over the boundary circles of an
annulus D gives different values, can fez) be analytic
in D? (Give reason.)
12. Is
IfJ(Z) I
dz
=
fel f(::)1 dz? How would you find a
bound for the integral on the left?
fez) dz
e
=
Re fez) dz? Give examples.
e
INTEGRATION
Integrate by a suitable method:
+ 2z from
- i to 2
+i
along any path
17. 5z - 3/z counterclockwise around the unit circle
-2
20.
21.
22.
23.
24.
e
z2
+ 57ri
/(z - 1)2 counterclockwise around
Izi
=
2
z1(z2 + 1) clockwise around Iz + il = 1
Re:: from 0 to 4 and then vertically up to 4 + 3i
cosh 4z from 0 to 2i along the imaginary axis
eZ/z over C consisting of 1.::1 = I (counterclockwise) and
Izl = ! (clockwise)
25. (sin z)/z clockwise around a circle containing z = 0 in
its interior
26. 1m z counterclockwi~e around /:::1 = r
27. (Ln z)/(z - 20 2 counterclockwise around /z - 2i/ = 1
2S. (tan 7r:::)/(z - 1)2 counterclockwise around /z - 1/ = 0.2
29. Izi + z clockwise around the unit circle
30. (z - i)-3(Z3 + sin z) counterclockwise around any
circle with center i
663
Summary of Chapter 14
Complex Integration
The complex line integral of a function fez) taken over a path C is denoted by
(1)
J
f
or. if C is closed. also by
fez) dz
c
(Sec. 14.1).
fez)
c
If fez) is analytic in a simply connected domain D, then we can evaluate (1) as in
calculus by indefinite integration and substitution of limits, that is,
J
(2)
fez) dz
=
[F' (z) = fez)]
F(z1) - F(zo)
c
for every path C in D from a point Zo to a point Z1 (see Sec. 14.1). These assumptions
imply independence of path, that is, (2) depends only on Zo and Z1 (and on fez),
of course) but not on the choice of C (Sec. 14.2). The existence of an F(z) such:that
F' (z) = fez) is proved in Sec. 14.2 by Cauchy's integral theorem (see below).i
A general method of integration, not restricted to analytic functions, uses the
equation z = z(t) of C, where a ~ t ~ b,
J
(3)
J
b
fez) dz =
c
f(z(t))z(t) dr
z = dZ) .
0
(
a
dl
Cauchy's integral theorem is the most important theorem in this chapter. It states
that if fez) is analytic in a simply connected domain D, then for every closed path
C in D (Sec. 14.2),
f
(4)
fez) dz
=
c
o.
Under the same assumptions and for any Zo in D and closed path C in D containing
<:0 in its interior we also have Cauchy's integral formula
1 ,(
f(zo) = - 2.
(5)
7T'1
fez)
r- dz.
Z - Zo
C
Furthermore, under these assumptions fez) has derivatives of all orders in D that
are themselves analytic functions in D and (Sec. 14.4)
(6)
(n)
f
_
n! ~
(zo) - - 2.
7T'1
fez)
rC ( __ )n+1
Z
dz
(n
= 1,2.· .. ).
'-0
This implies Morera's theorem (the converse of Cauchy's integral theorem) and
Cauchy's inequality (Sec. 14.4), which in turn implies Liouville's theorem that an
entire function that is bounded in the whole complex plane must be constant.
CHAPTER
15
Power Series, Taylor Series
Complex power series, in particular, Taylor series, are analogs of real power and Taylor
series in calculus. However, they are much more fundamental in complex analysis than
their real counterparts in calculus. The reason is that power series represent analytic
functions (Sec. 15.3) and, conversely, every analytic function can be represented by power
series, called Taylor series (Sec. 15.4).
Use Sec. 15.1 for reference if you are familiar with convergence tests for real seriesin complex this is quite similar. The last section (15 .5) on uniform convergence is optional.
Prerequisite: Chaps. 13, 14.
Sections thar may be omitted in a shorter course: 14.1, 14.5.
References and Answers 10 Problems: App. I Part D, App. 2.
15.1
Sequences, Series, Convergence Tests
In this section we define the basic concepts for complex sequences and series and discuss
tests for convergence and divergence. This is very similar to real sequences and series in
calculus. If you feel at home with the latter and want to take for granted that the ratio
test also holds ill complex, skip this section and go to Sec. 15.2.
Sequences
The basic definitions are as in calculus. An ii!finite sequence or, briefly, a sequence, is
obtained by assigning to each positive integer 11 a number Zn, called a term of the sequence,
and is wlitten
or
ZI' 22, •••
or briefly
We may also write 20, Z1, ••• or :::2, :::3, ••• or start with some other integer if convenient.
A real sequence is one whose terms are real.
Convergence.
A convergent sequence
lim
Zn =
C
21. Z2, •••
or simply
n~oo
By definition of limit this means that for every
(1)
664
Izn - cl
E
<
is one that
Zn~
ha~
a limit c, written
c.
> 0 we can find an N such that
E
for all
11
> N;
SEC. 15.1
665
Sequences, Series, Convergence Tests
geometrically, all terms Zn with n > N lie in the open disk of radius E and center c
(Fig. 358) and only finitely many terms do not lie in that disk. [For a real sequence, (1)
gives an open interval of length 2E and real midpoint c on the real line; see Fig. 359.]
A divergent sequence is one that does not converge.
y
:
x
Fig. 358.
E X AMP L E 1
C-E
Convergent complex sequence
C +E
C
Fig. 359.
Convergent and Divergent Sequences
The sequence {inln} = Ii, -112, -;/3, 114, ... } is convergent with limit O.
The sequence {in} = [i. -I. -i. I .... } is divergent. and so is {zn} with zn = (1
E X AMP L E 2
x
Convergent real sequence
+
On.
•
Sequences of the Real and the Imaginary Parts
The sequence [zn) with zn = xn + iYn = I - Iln 2 + ;(2 + 41n) is 6i, 3/4 + 4i, 8/9 + 1Oi/3, 15/16 + 3i, .
(Sketch it.) It converges with the limit c = I + 2;. Observe that {x",} has the limit 1 = Re c and {Yn} has the
limit 2 = 1m c. This is typical. It illustrates the following theorem by which the convergence of a complex
sequence can be referred back to that of the two real sequences of the real parts and the imaginary parts. •
THEOREM 1
Sequences of the Real and the Imaginary Parts
A sequence Z1, Z2, ... , Zn, . .. of complex numbers Zn = Xn + iYn (where
n = 1, 2, ... ) converges to c = a + ib if and only if the sequence of the real parts
Xl> X2 , • • • converges to a and the sequence of the imaginary parts Yl> )'2' . . .
converges to b.
PROOF
Convergence zn ~ C = a + ib implies convergence Xn ~ a and Yn ~ b because if
IZn - cl < E, then Zn lies within the circle of radius E about c = a + ib, so that
(Fig. 360a)
IYn - bl <
y
E.
y
": -!l~
~
b+~
b
b-~
b-E
I
:
I
a-E
a
a+E
x
-@
I
:
I
Eft
a
\
a-:2
Cal
Fig. 360.
E
a+:2
Chl
Proof of Theorem 1
x
666
CHAP. 15
Power Series, Taylor Series
Conversely. if Xn ~ a and Yn
so large that, for every II > N,
Ix n
~
b as n ~
x.
E
-
al <-2'
then for a given
I.vn
-
E
> 0 we can choose N
E
bl <
-2 .
These two inequalities imply that Zn = xn + iYn lies in a square with center c and side
E. Hence, zn must lie within a circle of radius E with center c (Fig. 360b).
•
Series
Given a sequence
we may form the sequence of the sums
Z10 Z2, . . . , ::m, ... ,
and in general
(11
(2)
sn is called the nth partial sum of the
i/~fillite
= 1. 2 ... ').
series or series
cc
2:
(3)
Zm
=
Z1
+ Z2 +
m~1
The Z10 Z2, • . . are called the terms of the series. (Our usual summation letter is 11,
unless we need 11 for another purpose, as here, and we then use m as the summation
letter.)
A convergent series is one whose sequence of partial sums converges. say,
x
lim
n--.oo
Sn = S.
Then we write
S
=
2:
::m =
ZI
+ 22 +
m~1
and call s the sum or value of the series. A series that is not convergent is called a divergent
series.
If we omit the terms of sn from (3), there remains
(4)
Rn
=
Zn+1
+ Zn+2 + Zn+3 +
This is called the remainder o/the series (3) after the term
and has the sum s, then
Zn'
Clearly, if (3) converges
thus
Now Sn ~ S by the definition of convergence; hence Rn ~ O. In applications, when s is
unknown and we compute an approximation Sn of s, then IRnl is the error, and Rn ~ 0
means that we can make /Rn/ as small as we please, by choosing 11 large enough.
An application of Theorem I to the partial sums immediately relates the convergence
of a complex series to that of the two series of its real parts and of its imaginary parts:
SEC. 15.1
667
Sequences, Series, Convergence Tests
THEOREM 2
Real and Imaginary Parts
A series (3) with Zm = Xm + iYm converges and has the sum s = u + iv if and only
ifx] + X2 + ... converges and has the sum u and Yt + Y2 + ... converges and
has the sum v.
Tests for Convergence and Divergence of Series
Convergence tests in complex are practically the same as in calculus. We apply them
before we use a series, to make sure that the series converges.
Divergence can often be shown very simply as follows.
THEOREM 3
Divergence
if a series Zl + Z2 + ...
converges, then lim Zm. = O. Hence if this does not hold,
m,~oo
the series diverges.
PROOF
[f Zl
+
Z2
+ ...
lim
rrz,--...:..x
converges, with the sum s, then, since Zn, =
Zm
= 71llim
(sm
__ 00
-
Sm-1)
= 'H1lim
__
(X)
Sm -
lim
Sm-1
-1'11-----"'00
Sm -
=
S -
Sm-1'
S
= o.
•
Zm ~ 0 is necessary for convergence but not sufficient, as we see from the
harmonic series I + ! + ~ + f + ... , which satisfies this condition but diverges, as is
shown in calculus (see, for example, Ref. [GRll] in App. I).
CAUTION!
The practical difficulty in proving convergence is that in most cases the sum of a series
is unknown. Cauchy overcame this by showing that a series converges if and only if its
partial sums eventually get close to each other:
THEOREM 4
Cauchy's Convergence Principle for Series
A series Zl + Z2 + ... is convergent if and only iffor every given E> 0 (no matter
how small) we can find an N (which depends on E, in general) such that
(5)
Izn+1
+
Zn+2
+ ... +
Zn+pl
<
E
for every n > Nand p = 1. 2, ...
The somewhat involved proof is left optional (see App. 4).
A series Zl + Z2
series of the absolute values of the terms
Absolute Convergence.
+ ... is called absolutely convergent if the
co
is convergent.
If Zl + Z2 + ... converges but 1z11 + IZ21 + ... diverges, then the series
is called, more precisely, conditionally convergent.
Zl
+ Z2 + ...
CHAP. 15
668
E X AMP L E 3
Power Series, Taylor Series
A Conditionally Convergent Series
The series I - i + ! - ! + - ... converges. but only conditionally since the harmonic series diverges, as
mentioned above (after Theorem 3).
•
If a series
is absolutely convergent, it is convergent.
This follows readily from Cauchy's principle (see Team Project 30), This principle also
yields the following general convergence test.
THEOREM 5
Comparison Test
If a series::l + Z2 + ... is given and we C([nfind a convergent series b l + b2 + ...
IZII
with nonnegative real fenns such that
converges, even absolutely.
PROOF
By Cauchy's principle, since b l
an N such that
+
b2
~
+ ...
bI> IZ21
~ b2 ,
.•• ,
converges, for any given
for every n > Nand p
From this and
IZII ;0; bI> 1z21
~ b2 •
.•.
Hence, again by Cauchy's principle,
is absolutely convergent.
then the given series
=
E
> 0 we can find
1,2, ....
we conclude that for those n and p,
Izil
+ IZ21 + ...
converges, so that
Zl + Z2 +
•
A good comparison series is the geometric series. which behaves as follows.
THEOREM 6
Geometric Series
The geometric series
00
2,
(6*)
qm = I
+
q
+
q2
+ ...
m=O
converges with the sum I/O - q)
PROOF
if Iql <
I and diverges
if Iql
~ 1.
If Iql ~ 1. then !qml ~ I and Theorem 3 implies divergence.
Now let Iql < l. The nth partial sum is
sn
=
I
+
q
+
From this,
On subtraction, most terms on the right cancel in pairs, and we are left with
SEC. 15.1
669
Sequences, Series, Convergence Tests
Now 1 - q
=1=
0 since q
=1=
1, and we may solve for
S/l.'
finding
1 - qn+l
(6)
1 - q
l-q
l-q
Since Iql < L the last tenn approaches zero as II ~ rx. Hence if
convergent and has the sum 11(1 - q). This completes the proof.
Iql
< L the series is
•
Ratio Test
This is the most important test in our further work. We get it by taking the geometric
series as comparison series b 1 + b 2 + ... in Theorem 5:
THEOREM 7
Ratio Test
<:1 + <:2 + ... with Zn
eve,)" n greater than some N,
If a series
=1=
0 (/1
I, 2, ... ) has the property that for
I Z~:1 I ~ q <
(7)
(n
I
(where q < I is fixed), this series cunverges absolutely.
If for
N)
every n > N,
IZ::1 I ~ 1
(8)
>
(n
>
N),
the series diverges.
PROOF
If (8) holds. then IZn+ll ~ Iznl for n > N, so that divergence of the series follows from
Theorem 3.
If (7) holds, then Izn+ll ~ IZnl q for Il > N, in particular,
etc.,
and in general,
IZN+pl
~ IzN+llqP-l. Since q
Absolute convergence of Zl
+ <:2 + ...
< 1, we obtain from this and Theorem 6
now follows from Theorem 5.
•
CAUTION! The inequality (7) implies IZn+llznl < 1, but this does not imply
convergence, as we see from the harmonic series, which satisfies ::'n+llzn = n/(n + 1) < I
for alln but diverges.
If the sequence of the ratios in (7) and (8) converges, we get the more convenient
670
CHAP. 1S
THEOREM 8
Power Series, Taylor Series
Ratio Test
if a series Z1 + Z2 + ... with Zn
~
* 0 (n = 1,2, ... ) is such that 2.! I
Zn+1
(a)
If L < 1, the series converges absolutely.
(b)
if L >
~
I=
L,
1, the series diverges.
(c) {f L = 1, the series may converge or diverge, so that the test fails and
permits no conclusion.
PROOF
(a) We write k n = IZn+1/2nl and let L = I - b < 1. Then by the definition of limit, the
kn must eventually get close to 1 - b, say, kn ~ q = 1 - ~b < 1 for alln greater than
some N. Convergence of 21 + Z2 + ... now follows from Theorem 7.
(b) Similarly, for L = I + c> 1 we have k n ;:::; I + ~c > I for alln > N* (sufficiently
large), which implies divergence of Z1 + Z2 + ... by Theorem 7.
(c) The harmonic series I
~
+
+
~
+ ...
has
Zn+1/Zn
=
11/(11
+
I), hence L
= 1, and
diverges. The series
1+
4
hence also L
1
+
+
9
16
+
1
25
+ ...
Zn+1
has
= 1, but it converges. Convergence follows from (Fig. 361)
Sn
I
4
I
= I + - + ... + - 2
n
~
fndX
-
1+
1
=2- - ,
n
1 X2
so that S10 S2, ••• is a bounded sequence and is monotone increasing (since the terms of
the series are all positive); both properties together are sufficient for the convergence of
the real sequence S10 S2, • • • . (In calculus this is proved by the so-called integral test.
whose idea we have used.)
•
y
\
Area
1
o
2
Convergence of the series 1 +
Fig. 361.
E X AMP L E 4
4
3
t + i + k + ...
Ratio Test
Is the following seIies convergent or divergent? (First guess, then calculate.)
L
n=O
(100 + 75i)n
n!
=
1 + (100 + 75;)
+ -
I
2!
(100
+
75;)2
+ ...
x
SEC. 15.1
671
Sequences, Series, Convergence Tests
Solution.
By Theorem 8, the series is convergent, since
Zn+l
I
E X AMP L E S
I
;100
=
+ 75il n +l/(n +
I)! = 1100
1100 + 75il n /11!
Zn
+ 75il
+
11
125
1
11
+
•
L = O.
1
Theorem 7 More General than Theorem 8
Let an = il2 3n and bn
ao
=
lI23n + 1. Is the following selies convergent or divergent?
+ b o + al + b 1 + ...
= i
+
1
2
+
iIi
64
"8 + 16 +
I
+ 128 + ...
Solution.
The ratios of the absolute values of successive terms are!,!,!,!, .... Hence convergence follows
from Theorem 7. Since the sequence of these ratios has no limit, Theorem 8 is not applicable.
•
Root Test
The ratio test and the root test are the two practically most important tests. The ratio test
is usually simpler, but the root test is somewhat more general.
THEOREM 9
Root Test
If a
series ::1
+
Z2
+ ...
is such that for every n greater thall some N,
(9)
(n
(where q < I is .fixed), this series converges absolutely.
If for
>
N)
infinitely mall)' n,
(10)
the series diverges.
PROOF
If (9) holds, then Iznl ~ qn < I for all n > N. Hence the series 1:::11 + IZ21 + ... converges
by comparison with the geometric series, so that the series ZI + Z2 + ... converges
absolutely. If (10) holds, then Iznl ~ 1 for infinitely many n. Divergence Of::l +::2 + ...
now follows from Theorem 3.
•
Vfz:J
CAUTION! Equation (9) implies
< 1, but this does not imply convergence, as
we see from the harmonic series, which satisfies ~ < I (for n > 1) but diverges.
If the sequence of the roots in (9) and (10) converges, we more conveniently have
THEOREM 10
Root Test
If a
series
ZI
+
Z2
+ ...
is such that lim
r~co
(a) The series converges absolute(..'
if L > 1.
ff L = 1, the test fails; that is,
Vfz:J =
if L <
L, then:
1.
(b) The series diverges
(c)
no conclusion is possible.
672
CHAP. 15
PROOF
Power Series, Taylor Series
The proof parallels that of Theorem 8.
(a) Let L = 1 - a* < 1. Then by the definition of a limit we have
~ < q = 1 - ~a* < 1 for all n greater than some (sufficiently large) N*. Hence
IZnl < qn < I for all n > N*. Absolute convergence of the series ZI + Z2 + ... now
follows by the comparison with the geometric series.
VfzJ
(b) If L > 1, then we also have
> I for all sufficiently large n. Hence
for those n. Theorem 3 now implies that ZI -f;, Z2 + ... diverges.
1
(c) Both the divergent harmonic series and the convergent series
t 116 + 2~ + .. give L = 1. This can be seen from (In n)/n ---7 0 and
SEQUENCES
11. Illustrate Theorem 1 by an example of your own.
12. (Uniqueness of limit) Show that if a sequence
converges. its limit is unique.
13. (Addition) If ZI, Z2, ... converges with the limit [and
ZI *, Z2 *, ... converges with the limit [*, show that
Zl + Zl *':::2 + Z2*' ... converges with the limit [ + [*.
14. (Multiplication) Show that under the assumptions of
Prob. L3 the sequence ZlZ1*' Z2Z2*' ... converges
with the limit U*.
15. (Boundedness) Show that a complex sequence is
bounded if and onl y if the two corresponding sequences
of the real parts and of the imaginary parts are bounded.
SERIES
Are the following series convergent or divergent? (Give a
reason.)
(10 - ISi)n
16.2:---n!
n=O
rye
18.
:L
n=O
·n
1
-2-- - .
n - 21
20.2:
n=2 In n
o .
e(2/nHn n
CC
Are the following sequences Zl, Z2, ... , Zn> ... bounded?
Convergent? Find their limit points. (Show the details of
your work.)
2. Zn = e- nwi / 4
1. Zn = (_l)n + il2"
3. Zn = (-1)n/(n + i)
4. Zn = (I + i)n
5. Zn = Ln «2 + i)n)
6. Zn = (3 + 4i)n/n!
7. Zn = sin (n'1T/4) + in
8. Zn = [(1 + 3i)rVioT
9. Zn = (0.9 + 0.li)2n
10. Zn = tS + Si)-n
116-241
>
+ ~ +!
e
11-101
IZnl
cc
17. ~o
CC
19.2:
n~l
(-I)n(1
(2n
I
Vn
n
+ 2i)2n+l
+ I)!
(n 1)3
22. 2: - ' - (1
n~O (3n)!
+
on
e
o .
•
n - i
23.2:
n=O
3n
+
2i
25. What is the difference between (7) and just stating
IZn+l/Znl
<
I?
26. Illustrate Theorem 2 by an example of your choice.
27. For what n do we obtain the term of greatest absolute
value of the series in Example 4? About how big is it?
First guess, then calculate it by the Stirling formula in
Sec. 24.4.
28. Give another example showing that Theorem 7 is more
general than Theorem 8.
29. CAS PROJECT. Sequences and Series. (a) Write a
program for graphing complex sequences. Apply it to
sequences of your choice that have interesting
"geometrical" properties (e.g., lying on an ellipse,
spiraling toward its limit, etc.).
(b) Write a program for computing and graphing
numeric values of the first n partial sums of a series
of complex numbers. Use the program to experiment
with the rapidity of convergence of series of your
choice.
30. TEAM PROJECT. Series. ta) Absolute convergence.
Show that if a series converges absolutely, it is
convergent.
(b) Write a short report on the basic concepts and
properties of series of numbers, explaining in each case
whether or not they carry over from real series
(discussed in calculus) to complex series, with reasons
given.
SEC. 15.2
673
Power Series
(c) Estimate of the remainder. Let Izn+llznl ~ q < 1,
so that the series Zl + Z2 + ... converges by the ratio
test. Show that the remainder Rn = Zn+l + Zn+2 + ...
satisfies the inequality IRnl ~ Izn+ll/(I - q).
(d) Using (c), find how many terms suffice for
computing the sum s of the series
15.2
CC
2:
n=l
n + i
2nn
with an error not exceeding 0.05 and compute s to this
accuracy.
(e) Find other applications of the estimate in (c)
Power Series
Power series are the most important series in complex analysis because we shall see that
their sums are analytic functions, and every analytic function can be represented by power
series (Theorem 5 in Sec. 15.3 and Theorem 1 in Sec. 15.4).
A power series in powers of z - Zo is a series of the form
(1)
2:
an(z - zo)n
=
ao
+
al(z - zo)
+
a2(Z - zoi
+
n~O
where z is a complex variable, ao, ab . . . are complex (or real) constants, called the
coefficients of the series, and zo is a complex (or real) constant, called the center of the
series. This generalizes real power series of calculus.
If Zo = 0, we obtain as a particular case a power series in powers of z:
GO
(2)
2:
anz
n
=
+
ao
alZ
+
a2z2
+
n~O
Convergence Behavior of Power Series
Power series have variable terms (functions of z), but if we fix z, then all the concepts
for series with constant terms in the last section apply. Usually a series with variable
terms will converge for some z and diverge for others. For a power series the situation is
simple. The series (l) may converge in a disk with center Zo or in the whole z-plane or
only at zo0 We illustrate this with typical examples and then prove it.
E X AMP L E 1
Convergence in a Disk. Geometric Series
The geometric series
converges absolutely if Izl < I and diverges if Izl
E X AMP L E 1
~ 1
(see Theorem 6 in Sec. 15.1).
Convergence for Every z
The power series (which will be the Maclaurin series of eZ in Sec. 15.4)
00
zn
L -,
n.
n=O
,2
Z3
=1+z+-+-+'"
2!
3!
•
674
CHAP.15
is
Power Series, Taylor Series
ab~olutely
convergent for every
~,
In fact. by the ratio test. for any fixed ::.
•
as
E X AMP L E 3
Convergence Only at the Center. (Useless Series)
The following power series converges only at z = 0, but diverges for every
z *" 0, as we shall show.
00
L
n!zn = 1
+
Z
I)
Izl
+ 2::2 + 6;:3 + ...
n=O
In fact, from the ratio test we have
(II + I)!zn+l
---n-n!z
I
THEOREM 1
I
+
= (n
x
-4
as
(;: fixed and
*" 0).
•
Convergence of a Power Series
(a) Every power series (1) converges at the center Zo.
rr
(b)
(1) converges at a point Z = Zl *- zo, it converges absolutely for every
closer to Zo than Zl, that is, Iz - zol < IZI - zol. See Fig. 362.
(C) If (1) diverges at a z
than Z2' See Fig. 362.
Z
Z2, it diverges for every zfarther away from Zo
=
y
, /'
I
; ' - - ....
I
I
\
----- .... "
,
,Divergent
qz] \
Cony.
\
0
I
" ..... _-""
Zo,'
\
I
,~Z2
,
' .... _----'
Fig. 362.
PROOF
(a) For z
:x:
Theroem 1
= Zo the series reduces to the single tenn ao.
(b) Convergence at z = Zl gives by Theorem 3 in Sec. 15.1 an(ZI - zo)n ~ 0 as n ~
This implies boundedness in absolute value,
for every n
Multiplying and dividing an(z - Zo)n by
(Zl -
= 0,
1.....
zo)n we obtain from this
00.
SEC. 15.2
675
Power Series
Summation over
Il
gives
(3)
Now our assumption Iz - 201 < 1:1 - ~ol implies that I(z - z.o)/(Zl - zo)1 < 1. Hence the
series on the right side of (3) is a converging geometric series (see Theorem 6 in
Sec. 15.1). Absolute convergence of (1) as stated in (b) now follows by the comparison
test in Sec. 15.1.
(c) If this were false, we would have convergence at a Z3 farther away from Zo than Z2'
This would imply convergence at Z2, by (b), a contradiction to our assumption of
divergence at Z2'
•
Radius of Convergence of a Power Series
Convergence for every z (the nicest case, Example 2) or for no z *- Zo (the useless case,
Example 3) needs no further discussion, and we put these cases aside for a moment. We
consider the smallest circle with center Zo that includes all the points at which a given
power series (I) converges. Let R denote its radius. The circle
Iz - zol
=R
(Fig. 363)
is called the circle of convergence and its radius R the radius of convergence of (l).
Theorem I then implies convergence everywhere within that circle, that is, for all z for
which
(4)
Iz - zol < R
(the open disk with center:o and radius R). Also. since R is as small as possible. the series
(l) diverges for all z for which
Iz - ~ol > R.
(5)
No general statements can be made about the convergence of a power series (1) on the
circle of convergence itself. The series (I) may converge at ~ome or all or none of these
points. Details will not be essential to us. Hence a simple example may just give us the
idea.
DiVergent
C3
co::;;r
Zo
Fig. 363.
Circle of convergence
CHAP.15
676
E X AMP L E 4
Power Series, Taylor Series
Behavior on the Circle of Convergence
On the circle of convergence (radius R
L ~ nln
L
2
~nl"
=
1 in all three series),
converges everywhere since L 1/,,2 converges.
converges at -1 (by Leibniz's test) but diverges at 1,
•
diverges everywhere.
=
Notations R
oc and R
notation, we write
R
=
x
= O.
To incorporate these two excluded cases in the present
z (as in Example 2),
converges only at the center z = ~o (as in Example 3).
if the series 0) converges for all
R = 0 if (1)
These are convenient notations, but nothing else.
Real Power Series. In this case in which powers, coefficients. and center are real.
formula (4) gives the convergence interval Ix - xol < R of length 2R on the real line.
Determination of the Radius of Convergence from the Coefficients.
important practical task we can use
THEOREM 1
For this
Radius of Convergence R
Suppose that the sequence lan +1lan l, n = 1. 2, ... , converges with limit L *. !f
L * = 0, thell R = x; that is, the power series (1) converges for all ~. If L *
0
(hence L * > 0), then
*"
(6)
R=
L*
I~I
= n~CXl
lim
an+l
(Cauchy-Hadamard formula l ).
If lan+l/anl ~ x, then R = 0 (convergence only at the center 20).
PROOF
For (1) the ratio of the terms in the ratio test (Sec. 15.1) is
The limit is
*"
L
=
L*lz -
zol.
:01
Let L* 0, thus L* > O. We have convergence if L = L*lz < 1, thus Iz - zol < IIL*,
and divergence if k - !ol > IIL*. By (4) and (5) this shows that IIL* is the convergence
radius and proves (6).
If L * = 0, then L = 0 for every z, which gives convergence for all z by the ratio test.
If lan+l/anl ~ co, then lan+l/anllz - zol > I for any z
Zo and all sufficiently large n.
This implies divergence for all z
Zo by the ratio test (Theorem 7, Sec. 15.1).
•
*"
*"
INamed after the French mathematicians A. L. CAUCHY (see Sec. 2.5) and JACQUES HADAMARD
(1865-1963). Hadamard made basic contributions to the theory of power series and devoted his lifework to
partial differential equations.
SEC. 15.2
677
Power Series
Formula (6) will not help if L * does not exist, but extensions of Theorem 2 are still
possible, as we discuss in Example 6 below.
E X AMP L E 5
Radius of Convergence
By (6) the radIUs of convergence of the power ~eries
R
=
(211)!
lim
-n~'X [ (1I!)2
I
(2n
«11
+ 2)! ]
+ 1)!)2
=
The series converges in the open disk
E X AMP L E 6
(211)!
- - 2 (z -
,,~O (n!)
3i)n is
[(211)!
«n + 1)!)2 ]
.
(2n + 2)!
(1I!)2
lim
n~oc
Iz -
oc
L
=
+ 1)2
1
+ 2)(211 + 1)
4
(11
lim
n~x (211
•
3;1 < ! of radius! and center 3;.
Extension of Theorem 1
Find the radius of convergence R of the power series
Solution.
The sequence of the ratios 1/6. 2(2
2 is of no help. It can be shown that
+
!), 1/(8(2 + !».... doe~ not converge. so that Theorem
R = 1/L,
(6*)
This still does not help here, since
whereas for even II we have
(V/i~)
does not converge because
Vfa:J = V2 + 112n~ 1
n
as
V);J
---+
=
~
=
112 for odd
II.
00,
."r.--:
so that V lanl has the two limit points 112 and I. It can further be shown that
R
(6**)
=
Tthe greatest limit point of the sequence
1/T,
Here T = I. so that R = I. Answer. The series converges for
Izl <
{Vj:J}.
•
1.
Summary. Power series converge in an open circular disk or some even for every z (or
some only at the center. but they are useless): for the radius of convergence. see (6) or
Example 6.
Except for the useless ones, power series have sums that are analytic functions (as we
show in the next section); this accounts for their importance in complex analysis.
.. -
.•=
33-3;=-- - - -
1. (Powers missing) Show that if ~ a"z'YI has radius of
convergence R (assumed finite), then ~ a,,:!''' has radius
of convergence "\'R. Give examples.
2. (Convergence behavior) Illustrate the facts shown by
Examples 1-3 by further examples of your own.
13-181
n~O
(z
+
l)n
n
6.
2:
2
/1
4.
2: -
n~O
n!
(z
+
2i)n
9.
2:
x
(n - i)"z"
n=O
11.
2:
n=l
2100.,
--z"
11!
n=O
8.2:
RADIUS OF CONVERGENCE
2:
n=l
Il!
n~O
Find the center and the radius of convergence of the
following power series. (Show the details.)
(z + i)n
x 11"
3.
00
5. 2: n
10.2:
n=O
(_1)n+1
zn
11
12. 2:
n=O
(-I)"
22"(11!)2
_2n
-
(2:::)2"
---
(211)!
4"
(1
+ i)"
(::: - 5)"
CHAP.15
678
13.
2:
11(11 -
1)(: - 3
+
Power Series. Taylor Series
2i)n
n=2
14.2: S
co
I)n
n~O (211).
16.
2:
:JC
15. 2:
:271
2n (z - i)4"
n=O
('>+3·)n
~5 _ i' (z -
7T)n
n=O
:x;
18.
2:
n~O
(411)!
~
2 (II!)
*
*
(: + 7Ti)"
19. CAS PROJECT. Radius of Convergence. Write a
program for computing R from (6), (6*), or (6"'*). in
this order, depending on the existence of the limits
needed. Test the program on series of your choice and
15.3
such that all three formulas (6). (6*), and (6**) will
come up.
20. TEAM PROJECT. Radius of Convergence. (a)
Formula (6) for R contains iOn/On+li, not iOn+l/Oni.
How could you memorize this by using a qualitative
argument?
(b) Change of coefficients. What happens to
R (0 < R < 00) if you (i) multiply all On by k
O.
(ii) multiply On by k fl O. (iii) replace On by lion?
(c) Example 6 extends Theorem 2 to nonconvergent
cases of O../On+l' Do you understand the principle of
"mixing" by which Example 6 was obtained? Use this
principle for making up further examples.
(d) Does there exist a power series in powers of z that
converges at z = 30 + 10i and diverges at z = 31 - 6i?
(Give reason.)
Functions Given by Power Series
The main goal of this section is to show that power series represent analytic functions
(Theorem 5). Along our way we shall see that power series behave nicely under addition,
multiplication, differentiation. and integration. which makes these series very useful in
complex analysis.
To simplify the formulas in this section. we take :0 = 0 and write
(1)
This is no restriction because a series in powers of
reduced to the fonn (I) if we set i - Zo = z.
£-
Zo
with any
Zo
can always be
Terminology and Notation. If any given power selies (1) has a nonzero radius of
convergence R (thus R > 0), its sum is a function of z. say fez). Then we write
x
(2)
fez)
=
2:
a"z'''
= ao +
2
alz' + a2z +
(izi <
R).
,,~o
We say that fez) is represented by the potrer series or that it is developed in the power
series. For instance. the geometric series represents the function fez) = lI( I - z) in the
interior of the unit circle IzI = 1. (See Theorem 6 in Sec. 15.1.)
Uniqueness of a Power Series Representation. This is our next goal. It means that
a jUllctioll f(:;:;) cannot be represented by two different power series with the same
center. We claim that if fez) can at all be developed in a power series with center zoo the
development is unique. This important fact is frequently used in complex analysis (as well
as in calculus). We shall prove it in Theorem 2. The proof will follow from
SEC. 15.3
Functions Given by Power Series
THEOREM 1
679
Continuity of the Sum of a Power Series
If afunction fez) CUll he represented by a power series (2) with radius of convergence
R > 0, then fez) is continuous at ;:: = o.
PROOF
From (2) with:: = 0 we have f(O) = ao. Hence by the definition of continuity we must
show that limz~o fez) = f(O) = ao. That is, we must show that for a given E> 0 there
is a 8 > 0 such that k:1 < 8 implies If(z) - aol < E. Now (2) converges absolutely for
Izl ;:; r with any r such that 0 < r < R, by Theorem 1 in Sec. 15.2. Hence the series
1
co
L
n= I
lanlrn-l = r
co
L
lanlrn
n~l
converges. Let S*-O be its sum. (S = 0 is trivial.) Then for 0 < Izl ;:; r.
and Izls < E when Izi < 8, where 8 > 0 is less than r and less than EIS. Hence
lzls < 8S < (EIS)S = E. This proves the theorem.
•
From this theorem we can now readily obtain the desired uniqueness theorem (again
assuming ':0 = 0 without loss of generality):
THEOREM 1
Identity Theorem for Power Series. Uniqueness
Let the power series ao + alZ + (/2Z2 + ... and b o + bIZ + b 2z2 + ... both be
convergent for l:::l < R, where R is positive, and let them both have the same SUlII for
all these z. Then the series are identical, that is, ao = bo, al = bI> a2 = b 2, ....
Hence if afullction f(;:;) can be represellted by a power series with any cellfer ZO,
this representation is unique.
PROOF
We proceed by induction. By assumption,
(Izl < R).
The sums of these two power series are continuous at z = 0, by Theorem 1. Hence if we
consider 1::1 > 0 and let z ~ 0 on both sides, we see that a o = bo: the assertion is true
for n = O. Now assume that an = bn for n = 0, 1, ... , m. Then on both sides we may
omit the terms that are equal and divide the result by zm+l (*- 0); this gives
Similarly as before by letting
completes the proof.
z~
0 we conclude from this that
am+l
bm + l . This
•
CHAP. 15
680
Power Series, Taylor Series
Operations on Power Series
Interesting in itself, this discussion will serve as a preparation for our main goal, namely,
to show that functions represented by power series are analytic.
Termwise addition or subtraction of two power series with radii of convergence RI
and R2 yields a power series with radius of convergence at least equal to the smaller of
RI and R2. Proof Add (or subtract) the partial sums Sn and s:; term by term and use
lim (sn ::!: s:;) = lim Sn ::!: lim s:;.
Termwise multiplication of two power series
f(;::) =
L
akz k = ao
+ (lIZ +
k~O
and
g(Z)
L
=
bmz m
= b o + bIZ +
m~O
means the multiplication of each term of the first series by each term of the second series
and the collection of like powers of z. This gives a power series, which is called the
Cauchy product of the two series and is given by
=
L
(aob n
+
albn - l
+ ... +
(lnbO)zn.
n~O
We mention without proof that this power series converges absolutely for each Z within
the circle of convergence of each of the two given series and has the sum s(;::) = f(;::)g(z).
For a proof. see [D5] listed in App. 1.
Termwise differentiation and integration of power series is permissible, as we show
next. We call derived series qf the power series (I) the power series obtained from (1)
by termwise differentiation, that is,
x
L
(3)
nanZ n -
1
= al + 2a 2z + 3a3z2 +
n~l
THEOREM 3
Termwise Differentiation of a Power Series
The derived series of a power series has the same radius of convergence
original series.
PROOF
af
the
This follows from (6) in Sec. 15.2 because
.
lim
n~x (11
+
nlanl
1) lan+ll
I I .I I
. -n - lim
.
= hm
-an- = hm -{Inn~ n + I ~= an+l
n~::>:) a n +l
or, if the limit does not exist, from (6**) in Sec. 15.2 by noting that
\Yn ~
I as Il ~::xl.
•
SEC. 15.3
Functions Given by Power Series
E X AMP L E 1
681
Application of Theorem 3
Find the mdius of convergence R of the following series by applying Theorem 3.
~ (n)
n~2
zn
= Z2 t- 3;::3
+ 6:;;4
~ IOz5 + .
2
Solution. Differentiate the geometric series twice term by term and mUltiply the result by z2f2 This yields
the given series. Hence R = 1 by Theorem 3.
•
Termwise Integration of Power Series
THEOREM 4
The power series
an
- - - ~n+
n
+ I
1
=
a
<..
7
o~
+ -al z2 + -a2 z3 + ...
2
3
obtained by integrating the series ao + al::' + a2z2
same radius of convergence as the original series.
+
tenll by term has the
The proof is similar to that of Theorem 3.
With Theorem 3 as a tool, we are now ready to establish our main result in this section.
Power Series Represent Analytic Functions
Analytic Functions. Their Derivatives
THEOREM 5
A power series with a non:;:,ero radius of convergence R represents an analytic
function at eve I)· point interior to its circle of convergence. The derivatives of this
function are obtained by differentiating the original series tenn by tenn. All the
series thus obtained have the same radius of convergence as the original series.
Hence, by the first statement, each of them represents an a7lalytic function.
PROOF
(a) We consider any power series (1) with positive radius of convergence R. Let fez) be
its sum and fl(:) the sum of its derived series; thus
00
(4)
and
fl(::') =
L
na n z-n -
1
.
n~l
We show that fez) is analytic and has the derivative f1(z) in the interior of the circle of
convergence. We do this by proving that for any fixed z with Izl < Rand /1;:. ~ 0 the
difference quotient [fez + /1;::) - f(::.)]//1z approaches fl(z). By termwise addition we first
have from (4)
Note that the summation starts with 2, since the constant term drops out in taking the
difference fez + /1.;:) - fez), and so does the linear term when we subtract f 1 (z) from the
difference quotient.
CHAP. 15
682
Power Series, Taylor Series
(b) We claim that the series in (5) can be written
(0)
2:
anLl::[(::
+
LlZ)n-2
+
+
2z(z
.lZ)"-3
+
+
(n - 2)Zn-3(;::
n=2
+
+
LlZ)
(11 -
1)::n-2].
The somewhat technical proof of this is given in App. 4.
(e) We consider (6). The brackets contain 11 - I terms, and the largest coefficient is
1. Since (11 - 1)2 ~ 11(11 - 1), we see that for Izl ~ Ro and Iz + ~;::I ~ Ro, Ro < R.
the absolute value of this series (6) cannot exceed
11 -
(7)
This series with lin instead of lanl is the second derived series of (2) at Z = Ro and converges
absolutely by Theorem 3 of this section and Theorem I of Sec. 15.2. Hence our present
series (7) converges. Let the sum of (7) (without the factor ILlzl) be K(Ro). Since (6) is
the right side of (5), our present result is
Letting .lz ~ 0 and noting that Ro « R) is arbitrary, we conclude that f(;::) is analytic at
any point interiorto the circle of convergence and its derivative is represented by the derived
series. From this the statements about the higher derivatives follow by induction.
•
Summary. The results in this section show that power series are about as nice as we
could hope for: we can differentiate and integrate them term by term (Theorems 3 and 4).
Theorem 5 accounts for the great importance of power series in complex analysis: the
sum of such a series (with a positive radius of convergence) is an analytic function and
has derivatives of all orders, which thus in turn are analytic functions. But this is only
part of the story. In the next section we show that, conversely, every given analytic function
f(:::') can be represented by power series, called Taylor series and being the complex
analog of the real Taylor series of calculus.
11-10
I
RADIUS OF CONVERGENCE BY
DIFFERENTIATION OR INTEGRATION
Find the radius of convergence in two ways: (a) directly by
the Cauchy-Hadamard formula in Sec. 15.2. (b) from a
series of simpler telms by using Theorem 3 or Theorem 4.
cc
1.
L
11(11 -
:x;
(.;: -
2i)n
_ LOG
!:I
.
3n n(11 + 1)
(7 - 1)2n
5"-
n=l
6.
n=2
i; (11) (±)n
n=k
-l-n
2·L--n=1 n(n + 1)
(::)2n+l
n=O
1)
3n
(-I)n
4'L-211+17r
k
:x;
7.
(-7)"
L ---'------'---n= 1
8.
11(11
+
1)(11
211(21l _ I)
L -nn- -
+ 2)
..2n
:x;
n=l
..2n-2
SEC. 15.4
9.
L [(11 + 1e)]-1 zn+k
cc
n~O
10.
683
Taylor and Maclaurin Series
L
DC
Ie
(n
n~O
+ 111) z"
17. (Odd function) If .f(z) in (1) is odd (i.e.,
.f(-z) = -.f(z», show that an = 0 for even n. Give
examples.
I1l
11. (Addition and subtraction) Write our the details of
the proof on terrnwise addition and subtraction of
power series.
18. (Even functions) If .f(z) in (1) is even (i.e.,
.f( - z) = .f(z», show that an = 0 for odd n. Give
examples.
19. Find applications of Theorem 2 in differential equations
12. (Cauchy product) Show that
(1 - Z)-2 = L';;~O (n + l)zn tal by using the Cauchy
product, (b) by differentiating a suitable series.
and elsewhere
20. TEAM PROJECT. Fibonacci nmnbers.2 tal The
13. (Cauchy product) Show that the Cauchy product of
L~~O zn/n! multiplied by itself gives L~~O (2zyn/n!.
Vn ~ I as n ~
claimed in the proof of Theorem 3).
14. (On Theorem 3) Prove that
ex;
(as
15. (On Theorems 3 and 4) Find further examples of your
own.
116-201
APPLICATIONS OF THE IDENTITY
THEOREM
State clearly and explicitly where and how you are using
Theorem 2.
16. (Bionomial coefficients) Using
(1 + z)P(J
relation
15.4
+
z)q = (1
+
z)p+q. obtain the basic
Fibonacci
numbers
a o = al =
1. a n +l
are
recursively defined by
if n = 1. 2 .....
Find the limit of the sequence (an+l/an)'
(b) Fibonacci's rabbit problem. Compute a list of
a 1. .... a12' Show that a12 = 233 is the munber of
pairs of rabbits after l2 months if initially there is 1
pair and each pair generates I pair per month,
beginning in the second month of existence (no deaths
occuning).
(c) Generating function. Show that the generating
junction of the Fibonacci numbers is
.f(z) = I/(1 - z - Z2); that is, if a power series (l)
represents this .f(z), its coefficients must be the
Fibonacci numbers and conversely. Hint. Start from
.f(z) (1 - z - Z2) = I and use Theorem 2.
= an
+ an-l
Taylor and Maclaurin Series
The Taylor series 3 of a function fez), the complex analog of the real Taylor series is
(1)
where
or, by (l), Sec. 14.4,
(2)
1
a - -n 21Tl'
f
C
f(::;*)
dz*.
(z* - zd n + 1
In (2) we integrate counterclockwise around a simple closed path C that contains ::'0 in
its interior and is such that f(:::) is analytic in a domain containing C and every point
inside C.
A Maclaurin series 3 is a Taylor series with center zo = O.
2LEONARDO OF PISA, called FIBONACCI (= son of Bonaccio), about 1180-1250, Italian mathematician.
credited with the first renaissance of mathematics on Christian soil.
3BROOK TAYLOR (1685-1731), English mathematician who introduced real Taylor series. COLIN
MACLAURIN (1698--1746), Scots mathematician, professor at Edinburgh.
684
CHAP. 15
Power Series, Taylor Series
The remainder of the Taylor series (1) after the telm an(z - zo)n is
(3)
(proof below). Writing out the corresponding pmtial sum of (1). we thus have
fez) = f(2o)
+
z - zo,
-l-!- f (zo)
+
(z - ZO)2 "
2!
f (zo)
+ ...
(4)
This is called Taylor's formula with remainder.
We see that Taylor series are power series. From the last section we know that power
series represent analytic functions. And we now show that eve I}' analytic function can be
represented by power series, namely, by Taylor series (with various centers). This makes
Taylor series very important in complex analysis. Indeed. they me more fundamental in
complex analysis than their real counterparts me in calculus.
Taylor's Theorem
THEOREM 1
Let fez) be analytic in a domain D, and let z = 20 be any point in D. Then there
exists precisely one Taylor series (1) with center ':0 that represents fez). This
representation is mlid in the largest open disk with center.:o in which fez) is analytic.
The remainders Rn(z) of (1) can be represented in the f0171l (3). The coefficients
satisfy the inequality
M
lal:::S;n rn
(5)
Irhere M is the 1I1(n:ill1ll1ll of If(z)1
also in D.
PROOF
011
a circle
Iz - :01 =
r ill D whose interior is
The key tool is Cauchy's integral formula in Sec. 14.3; writing
z (so that z* is the vmiable of integration), we have
(6)
fez)
= -1.
f ---
21Tl
f(z*)
C
z* - z
z and z* instead of 20 and
dz*.
z lies inside C, for which we take a circle of radius r with center
Zo and interior in D
(Fig. 364). We develop 1/(z* - z) in (6) in powers of z - z{). By a standard algebraic
manipulation (worth remembering!) we first have
(7)
z* - z
1
z* - zo - (z - z{))
SEC. 15.4
685
Taylor and Maclaurin Series
For later use we note that since z* is on C while z is inside C, we have
z-zol<]
z* - 20
I
(7*)
(Fig. 364).
y
x
Fig. 364.
Cauchy formula (6)
To (7) we now apply the sum formula for a finite geometric sum
qn+l
I - qn+l
1
(8*)
+ q + ... + qn = --'----I - q
I - q
(q =1= 1),
i-q
which we use in the form (take the last term to the other side and interchange sides)
I
(8)
I + q + ... + qn
I - q
qn-t-l
+
]-q
Applying this with q = (z - zo)/(z* - zo) to the right side of (7), we get
I
z* - .,.
z* - Zo
[
]+
+
z - Zo
z* - Zo
( Z - Zo
Z* - Zo
+
)2
+
+
( Z-
20
z* - Zo
)nJ
I
( z - Zo )n+l
z* - Z
Z* - Zo
We insert this into (6). Powers of z - Zo do not depend on the variable of integration z*.
so that we may take them out from under the integral sign. This yields
fez)
.
I
21Ti
= -
f
f(z*)
dz*
c z* - zo·
Z - <'0
+ -21Ti
... +
1
r
f(z*)
dz*
c (z* - 20)2
(z - zo)n
21Ti
f
+ ...
f(z*)
c (z* - zo)n+l
dz*
+ Rn(z)
with Rn(z) given by (3). The integrals are those in (2) related to the derivatives, so that
we have proved the Taylor formula (4).
Since analytic functions have derivatives of all orders, we can take n in (4) as large as
we please. If we let n approach infinity, we obtain (I). Clearly, (I) will converge and
represent f(z) if and only if
(9)
lim Rn(z) = O.
n-->oo
686
CHAP. 15
Power Series, Taylor Series
We prove (9) as follows. Since .:* lies on C. whereas.: lies inside C (Fig. 364). we have
1.:* - zI > O. Since fez) is analytic inside and on C, it is bounded, and so is the function
f(::.*)/(z* - z). say.
-f(::.*)
-z* - z
I
I ~M
for all z"" on C. Also. C has the radius r = k*
ML-ineguality (Sec. 14.1) we obtain from (3)
IRnl =
(10)
..,:;
1z -
'.0
In+l
27T
Iz - zoln+l
27T
- zol and the length 27Tr. Hence by the
If
f(z*)
c (.:* - .:o)n+l(z* - .:)
M
1
r
n+l
-I
d::.*1
z - Zo
27T1" -_ M --
r
r+
1
Now Iz - ':01 < r because 2 lies inside C. Thus Iz - 20111" < L so that the right side
approaches 0 as n ~ x. This proves the convergence of the Taylor series. Uniqueness
follows from Theorem 2 in the last section. Finally, (5) follows from 0) and the Cauchy
•
inequality in Sec. 14.4. This proves Taylor's theorem.
Accuracy of Approximation. We can achieve any preassinged accuracy in
approximating f(::.) by a paI1ial sum of ( I ) by choosing n large enough. This is the practical
aspect of formula (9).
Singularity, Radius of Convergence. On the circle of convergence of 0) there is at
least one singular point of fez), that is, a point 2 = c at which fez) is not analytic (but
such that every disk with center c contains points at which fez) is analytic). We also say
that f(::.) is singular at c or has a singUlarity at c. Hence the radius of convergence R of
(1) is usually equal to the distance from z.() to the nearest singular point of f(::.).
(Sometimes R can be greater than that distance: Ln.: is singular on the negative real
axis, whose distance from Zo = - 1 + i is ], but the Taylor series of Ln ::. with center
Zo = -] + i ha<; radius of convergence V2.)
Power Series as Taylor Series
Taylor series are power series-Df course! Conversely, we have
THEOREM 2
Relation to the Last Section
A pml'er series with a non::,ero radills of convergence is the Taylor series of its SUI1I.
PROOF
Given the power series
Then f(zo) = ao. By Theorem 5 in Sec. 15.3 we obtain
f' (::.)
f"(z)
+ 2a2(Z - ':0) + 3a3(;:' = 2a 2 + 3' 2(::. - ':0) + ... ,
=
al
ZO)2
+ ... ,
thus
f'(.::o)
=
(/1
thus
f"(::.o)
=
2! a2
SEC. 15.4
687
Taylor and Maclaurin Series
and in generalln)(zo) = n! an' With these coefficients the given series becomes the Taylor
•
series of fez) with center zoo
Comparison with Real Functions. One surpnsmg property of complex analytic
functions is that they have derivatives of all orders, and now we have discovered the other
surprising property that they can always be represented by power series of the form (I).
This is not true in general for real/unctions; there are real functions that have derivatives
of all orders but cannot be represented by a power series. (Example: f(x) = exp ( - l/x 2 )
if x*"O and f(O) = 0; this function cannot be represented by a Maclaurin series in an
open disk with center 0 because all its derivatives at 0 are zero.)
Important Special Taylor Series
These are as in calculus, with x replaced by complex z. Can you see why? (Answer. The
coefficient formulas are the same.)
X AMP L E 1
Geometric Series
LeI I(z) = 11(1 - z). Then we have In)(::;)
11(1 - ::;) is the geometric series
= n!/(1
- ::;)n+l, In)(O)
= II!. Hence the Maclaurin expansion of
00
(11)
I(::;) is singular at
. LE
1-
z
=
2:
zn
= I
+ z + z2 + ...
(Izl
<
I).
n=O
z = I: this point lies on the circle of convergence.
•
Exponential Function
We know that the exponential function eZ (Sec. 13.5) is analytic for all z, and (e z )' = eZ • Hence frum (I) with
Zo = 0 we obtain the Maclaurin series
(12)
This series is also obtained If we replace x In the familiar Maclaurin series of eX by z.
Funhermore. by setting z = iy in (12) and separating the series into the real and imaginary pans (see
Theorem 2. Sec. 15.1) we obtain
Since the series on the right are the familiar Maclaurin series of the real functions cos y and sin .1', this shows
that we have rediscovered the Euler formula
(13)
e
iy
= cos y + i sin y.
Indeed, one may use (12) for definillg eZ and derive from (12) the basic propenies of eZ • For instance, the
Z
•
differentiation formula (e ) ' = eZ follows readily from (12) by termwise differentiation.
R
CHAP. 15
E X AMP L E:I
Trigonometric and Hyperbolic Functions
Power Series, Taylor Series
By substituting (12) into (1) of Sec. 13.6 we obtain
x
COSz
=
:L
.,2n
~2
~
(_1)n
+ '"
2!
4!
(_11).
n=O
A
'"
-
1
+ ...
-
(14)
z2n+l
00
sin z
=
:L
(_l)n
+ 1)!
(2n
n=O
==z-
Z3
Z5
+
3!
5!
-+
When ~ = \. these are the familiar Maclaurin series of the real functions cos x and sin x. Similarly, by substituting
(12) into (II), Sec. 13.6. we obtain
cosh
z
=
2.:
n=O
~2
Z2n
1+
(2n)!
2!
_4
+ 4! + ...
(15)
:x;
sinh Z =
2.:
n=O
)( AMP L E 4
z2n+l
+
(211
Z3
I)!
=z+
3!
_5
+
•
5!
Logarithm
From (\) it follows that
_2
Ln (1
(16)
Replacing;;; by
+ z)
=
z-
_3
+
"'2
3
- + ...
Clzi <
1).
-z and multiplying both sides by -1, we get
1
(17)
-Ln(l -;;;)
= Ln
~
.2_3
= z + '2 + '3 + ..
(kl <
1).
By adding both series we obtain
1 + ;;;
Ln - - = 2
(18)
1-
z
(
z + -~3 + -<;5 + ... )
3
5
(1;;;1 < 1). •
Practical Methods
The following examples show ways of obtaining Taylor series more quickly than by the
use of the coefficient formulas. Regardless of the method used. the result will be the same.
This follows from the uniqueness (see Theorem 1t
., L E
r
Substitution
Find the Maclaurin series of f(;;;) = 1I( I
Solution.
(19)
By substituting
-Z2
+ ;;;2).
for;: in (11) we obtain
SEC. 15.4
689
Taylor and Maclaurin Series
E X AMP L E 6
Integration
Find the Maclaurin series of fez)
Solution.
We have f' (z)
arctan z.
=
= 11(1 +
oc
arctan Z
(_I)n
Z3
2n+l
~ ~I z
=
flO) = 0 we get
Z2). Integrating (19) term by term and using
=
Z5
(izi <
z - -3 + -5 - + ...
I);
n=O _II
u + iv = arctan z defined as that value for which
this series represents the principal value of w
lui <
E X AMP L E 7
-rr12.
Development by Using the Geometric Series
Develop lI(e - z) in powers of z - zo, where e - 20
*' O.
Solutioll.
This was done in the proof of Theorem I, where e = z*. The beginning was simple algebra and
then the use of (II) with z replaced by (z - zo)/(e - zo):
~ (Z-Zo)n
e-z
e - Zo - (z - zo)
(e-ZQ)
(
zI-~
7
c - Zo
)
e - Zo
n~O
c - Zo
e-::
Z--)2
(
e - Zo
)
+ ....
This series converges for
z -- Zo
e - Zo
I
E X AMP L E 8
I
<I,
that is,
Iz - 201 <
•
Ie - zol·
Binomial Series, Reduction by Partial Fractions
Find the Taylor series of the following function with center Zo
f(z) =
Solution.
We develop
f(~)
(20)
1with 111
f(z)
=
3
Z
+
2
Z
-
L
=
8z - 12
in partial fractions and the first fraction in a binomial series
___ =
(I + Zr
(1
+
1)
mz +
m(m
+ Z)-m
(-m)
~
=
2!
Z2 -
m(m
Zn
n
n=O
+
l)(m
+
2)
3!
Z3
+ ...
2 and the second fraction in a geometric series, and then add the two series term by term. This gives
= __1 _ +
(z +
2l
_2_
i ~ (-2) (z ~
n~O
8
9
=
z - 3
I
[3
I
+
n~O
23
108 (z -
n~O
2
I) -
=
2
2 - (z - I)
)n _~ (z; I )n = ~
Il
31
54 (z - 1) -
_
(z - 1)]2
2. (
9
[(-1)~:2+
I
[I
+ !(z
I) -
) _
- 1)]2
2~
]
(Z _
I
I - ~(z - 1)
I)n
3
275
3
1944 (z - I)
We see that the first series converges for Iz - II < 3 and the second for Iz - II < 2. This had to be expected
because I/(z + 2)2 is singular at -2 and 2/(z - 3) at 3. and these points have distance 3 and 2. respectively,
•
from the center Zo = L Hence the whole series converges for Iz - II < 2.
CHAP. 15
690
Power Series, Taylor Series
•••..........
w .... ·
. _ ... lA--..
" _ ......
. - . ........
• ..-•_ .· .....
___
I~~
TAYLOR AND MACLAURIN SERIES
Find the Taylor or Maclaurin series of the given function
with the given point as center and detennine the radius of
convergence.
1. e -2z ,
3. e Z ,
0
2. I/(I
-
(3),
-2i
4. cos2
Z,
0
5. sin z,
7. 1/(1
7r12
6. 1/z.
z),
8. Ln (I - z).
-
-z2(2
,
9. e
10. e Z 2
0
f
Z6 -
+
Z4
I,
Z2 -
12. sinh (z - 2i),
0
2i
Find the Maclaurin series by tennwise integrating the
integrand. (The integrals cannot be evaluated by the usual
methods of calculus. They define the error function erf z,
sine integral Si(z). and Fresnel integrals4 S(z) and C(z).
which occur in statistics, heat conduction. optics, and other
applications. These are special so-called higher
transcendental functions.)
13. erfz = • 2f
V7r
L
15. S(z) =
f
e- t 2 dt
14. Si(z)
=
f
o
0
B2 =
I
B4 = - 30
B5
-
I
B3
6
= 0,
= O.
1
B6 = 42
t
4i
2i
tan z = e 2iz
- i
_
n=l
18. (Inverse sine) Developing
show that
arcsin z =
z+
•
SID t
--
uV I
-
Z2
and integrating,
(±) ~ + (~:!) ~
+ (~)
dt
2'4' 6
_7
+.
7
(izl
<
1).
Z
sin t 2 dt
Show that this series represents the principal value of
arcsin z (defined in Team Project 30. Sec. 13.7).
o
17. CAS PROJECT. sec, tan, arcsin. (a) Euler numbers.
The Maclaurin series
E22
(21) sec z = Eo - z
2!
+ -E44
z
4!
-
+ ...
defines the Elller numbers E 2n- Show that Eo = 1,
E2 = -I, E4 = 5, E6 = -61. Write a program that
computes the E 2n from the coefficient formula in (1)
or extracts them as a list from the series. (For tables
see Ref. [GRI]. p. 810. listed in App. 1.)
(b) Bernoulli numbers. The Maclaurin series
(22)
,
(c) Tangent. Using (1), (2), Sec. 13.6, and (22), show
that tan z has the following Maclaurin series and
calculate from it a table of Bo, ... , B2O:
(24)
HIGHER TRANSCENDENTAL
FUNCTIONS
Z
2
Write a program for computing Bn.
e- t 2 dt,
Z
Bl =
(23)
0
0
11.
defines the Bernoulli numbers Bn. Using undetermined
coefficients, show that
z
e' - 1
19. (Undetennined coefficients) Using the relation
sin z = tan Z cos Z and the Maclaurin series of sin z and
cos z, find the first four nonzero terms of the Maclaurin
series of tan z. (Show the details.)
20. TEAM PROJECT. Properties from Maclaurin
Series. Clearly, from series we can compute function
values. In this project we show that properties of
functions can often be discovered from their Taylor or
Maclaurin series. Using suitable series, prove the
following.
(a) The fonnulas for the derivatives of e2 , cos z, sin z,
cosh Z, sinh z, and Ln (1 + z)
(b)
4(i + e-iz )
Z
(c) sin z
=1=
=
cos Z
0 for all pure imaginalY Z = iy
*" 0
4AUGUSTIN FRESNEL (1788-1827), French physicist and engineer, known for his work in optics
SEC. 15.5
15.5
Uniform Convergence.
691
Optional
Optional
Uniform Convergence.
We know that power series are absolutely convergent (Sec. 15.2, Theorem 1) and, as
another basic property, we now show that they are ul1ifOlmly convergent. Since uniform
convergence is of general importance, for instance, in connection with termwise integration
of series, we shall discuss it quite thoroughly.
To define uniform convergence, we consider a series whose terms are any complex
functions f o(z), f 1 (z) . ... :
oc
L
(1)
fm(z) = fo(z)
+
fl(z)
+ f2(z) + ....
m~O
(This includes power series as a special case in which f m(Z) = am (Z - Z-O)"m.) We assume
that the series (1) converges for all z in some region G. We call its sum s(z) and its nth
partial sum sn(z); thus
Convergence in G means the following. If we pick a z = ZI in G, then, by the definition
of convergence at 210 for given E > 0 we can find an N 1(E) such that
If we pick a
22
in G, keeping
E
as before, we can find an N2 ( E) such that
and so on. Hence, given an E > 0, to each Z in G there corresponds a number Nzt E).
This number tells us how many terms we need (what Sn we need) at a Z to make
Is(:) - sn(z)1 smaller than E. Thus this number NiE) measures the speed of
convergence.
Small Ni E) means rapid convergence. large NzC E) means slow convergence at the point
z considered. Now, if we can find an N(E) larger than all these NzCE) for all z in G, we
say that the convergence of the series (1) in G is uniform. Hence this basic concept is
defined as follows.
DEFINITION
Uniform Convergence
A series (1) with sum s(z) is called uniformly convergent in a region G if for every
E > 0 we can find an N = N( E), not depelldillg Oil Z, such that
for all n > N( E) alld all z ill G.
UniformilY of convergence is thus a property that always refers to an it~tillite set in
the z-plane, that is, a set consisting of infinitely many points.
CHAP. 15
692
E X AMP L E 1
Power Series, Taylor Series
Geometric Series
Show that the geometric series 1 + Z + Z2 + ... is (a) uniformly convergent in any closed disk
(b) not uniformly convergent in its whole disk of convergence Izl < 1.
Solution.
1111 -
zl ~
(a) For z in that closed disk we have
11 -
zl ~
I - r
(sketch it).
Izl ~ r <
I.
This implies that
l/(1 - r). Hence (remember (8) in Sec. 15.4 with q = z)
Is(z) - sn(z)1 =
I
I I I~
L
00
n+l
zm =
; _
z
n+l
; _ r .
m=n+l
Since r < I, we can make the right side as small as we want by choosing n large enough, and since the right
side does not depend on z (in the closed disk considered), this means that the convergence is uniform.
(b) For given real K (no maner how large) and
zn+l
I1 - z
11
I
we can always find a z in the disk
=
Izln+l
zl
11 -
Izl <
1 such thaI
> K
'
simply by taking z close enough tu 1. Hence no single N( E) will suffice to make Is(z) - sn(z)1 smaller than a
given E > 0 throllghollt the whole disk. By definition, this shows that the convergence of the geometric series
•
in Izl < I is not uniform.
This example suggests thatfor a power series, the unifomlity of convergence may at most
be disturbed near the circle of convergence. This is true:
1 'UOREM 1
Uniform Convergence of Power Series
A power series
(2)
m=O
with a nonzero radius of convergence R is uniformly convergent in every circular
disk Iz - Zol 2 I' of radius I' < R.
PROOF
For
Iz -
zol 2
I'
and any positive integers nand p we have
Now (2) converges absolutely if Iz - zol = I' < R (by Theorem 1 in Sec. 15.2). Hence it
follows from the Cauchy convergence principle (Sec. 15.1) that. an E> 0 being given.
we can find an N( E) such that
for n > N( E)
and
p
= I, 2, ....
From this and (3) we obtain
for all z in the disk Iz - zol 2 r. every n > N(E), and every p = 1,2, .... Since N(E) is
independent of z, this shows uniform convergence, and the theorem is proved.
•
Theorem 1 meets with our immediate need and concern, which is power series. The
remainder of this section should provide a deeper understanding of the concept of uniform
convergence in connection with arbitrary series of variable terms.
SEC. 15.5
Uniform Convergence.
693
Optional
Properties of Uniformly Convergent Series
Unifonu convergence derives its main importance from two facts:
1. [f a series of continuous tenus is unifonuly convergent, its sum is also continuous
(Theorem 2, below).
2. Under the same assumptions, tenuwise integration is permissible (Theorem 3).
This raises two questions:
1. How can a converging series of continuous tenus manage to have a discontinuous
sum? (Example 2)
2. How can something go wrong in termwise integration? (Example 3)
Another natural question is:
3. What is the relation between absolute convergence and unifonu convergence? The
surprising answer: none. (Example 5)
These are the ideas we shall discuss.
If we add finitely many continuous functions, we get a continuous function as their sum.
Example 2 will show that this is no longer true for an infinite series, even if it converges
absolutely. However, if it converges uniformly, this cannot happen, as follows.
THEOREM 2
Continuity of the Sum
Let the series
2:
f",(z)
=
fo(z)
+
fl(Z)
+ ...
m=O
be ull(fonnly convergent in a region C. Let F(z) be its sum. Then if each term f m(Z)
is continuous at a point ZI in C, the junction F(z) is continuous at ZI'
PROOF
Let sn(:::) be the nth partial sum of the series and Rn(::,J the corresponding remainder:
sn
=
f0
+
f1
+ ... +
f n'
Rn
=
fn+l
+
fn+2
+ ...
Since the series converges uniformly, for a given E> 0 we can find an N = N(E) such
that
for all Z in C.
Since SN(Z) is a sum of finitely many functions that are continuous at
continuous at ZI' Therefore. we can find a [) > 0 such that
Using F
= SN + RN and the triangle inequality (Sec.
This implies that F(z) is continuous at
Zl.
13.2), for these
and the theorem is proved.
21'
this sum is
z we thus obtain
CHAP. 15
4
2
Power Series, Taylor Series
Series of Continuous Terms with a Discontinuous Sum
Consider the series
(x real)
This is a geometric series with q = 1/(1
sn(x) = x
2[
I +
+ x 2 ) times a factor x 2 . Its /lth partial sum is
I
+
---2
I +x
(I
+
I 2 2 + ... +
x )
I 2] '
(I + x )n
We now use the trick by which one finds the sum of a geometric series, namely, we multiply
sn(x) by -q = -I/(I
-
+
2
x ),
__
1-2 ST/(x) = -x2 [ _ _
1-2
I
+
I
x
+
+ ... +
x
(I
2
+
x )n
+
(1
+
1 ]
12
X
)n+
.
Adding thIs to the previous formula, simplifying on the left, and canceling most terms on the right, we obtain
X2
2 [
- - - 2 Sn(x) = x
I+x
I
thus
*' 0, the sum is
The exciting Fig. 365 "explains" what is going on. We see that if x
Sex) = lim sn(x) = I
°1l_X
+
2
x ,
but for x = 0 we have s,,(O) = I - I = 0 for all /1, hence s(O) = O. So we have the surprising fact that the
sum is discontinuous (at x = 0), although all the tenns are continuous and the series converges even absolutely
(its terms are nonnegative. thus equal to their ab~olute value!).
Theorem 2 now tells us that the convergence cannot be uniform in an interval containing x = O. We can also
verify this directly. Indeed. for x 0/= 0 the remainder ha, the absolute value
I
IRn(x)1 = Is(x) - sn(x)1 =
(I
+
2
x)
n
and we see that for a given E « I) we cannot find an N depending only on
and all x, say, in the interval 0 ~ x ~ I.
E
y
2
8
8
8
4
1.5
8
64
)
I
8 16,
~
8
1
,1 1/
-1
"'iR.365.
0
Partial sums in Example 2
x
such that IRnl <
E
for all n > Nf E)
SEC. 15.5
I
695
Optional
Uniform Convergence.
ermwise Integration
This is our second topic in connection with unifonn convergence, and we begin with an
example to become aware of the danger of just blindly integrating tenn-by-tenn.
X AMP L E 3
Series for which Termwise Integration is Not Permissible
Let IImtX) = IIIxe -.".,?- and consider the series
where
in the interval 0
~
x
~
1. The nth partial sum is
Hence the series has the sum F(x) = lim sn(x) = lim II n (x) = 0
n-+oo
n--+oo
I
(0
~
x ~ 1). From this we obtain
I
F(x) dx = O.
o
On the other hand. by integrating term by term and using
I
oc
:L
-rn=1
Now sn =
£I"
n
1
:L
n--+oom=l
f mIx) dx = lim
0
I
0
fI + f2 + ... +
Ii
1
f mIx) ~x = lim
fn = sn' we have
n--+x
Sn(X) dx.
0
and the expression on the right becomes
lim
n--+x
I
1
unCx)
0
dx
= lim
n--+oo
I
1
0
Ilxe -n.i'
dx
1
=
lim -2 (1 - e -n)
n--+Xi
=
21 .
but not O. This shows that the serie, under consideration cannot be integrated term by term from x = 0 to
= I.
•
X
The series in Example 3 is not unifonnly convergent in the interval of integration, and
we shall now prove that in the case of a unifonnly convergent series of continuous
functions we may integrate term by tenn.
"HOREM
Termwise Integration
Let
00
Hz) =
2:
I m(z)
=
Io(z)
+
fl(Z)
+
m=O
be a uniformly convergent series of continuous functions in a region G. Let C be
any path in G. Then the series
(4)
is convergent and has the sum
Ic F(z) dz..
CHAP.15
696
PROOF
Power Series, Taylor Series
From Theorem 2 it follows that F(z) is continuous. Let s,.(::) be the 11th partial sum of the
given series and R,,(::') the corresponding remainder. Then F = Sn + Rn and by integration,
J
c
F(z) elz =
J
c
+
s,.(z) liz
J
c
Rn (::.) elz.
Let L be the length of C. Since the given series converges uniformly, for every given
E > 0 we can find a number N such that IRn(z)1 < ElL for all Il > N and all ::. in G. By
applying the ML-inequality (Sec. 14.1) we thus obtain
E
< - L
L
Since Rn = F -
Sn,
=
for all n > N.
E
this means that
for all
fl
> N.
•
Hence, the series (4) converges and has the sum indicated in the theorem.
Theorems 2 and 3 characterize the two most important properties of uniformly convergent
selies. Also, since differentiation and integration are inverse processes, Theorem 3 implies
THEOREM 4
Termwise Differentiation
Let the series f o(z) + f 1 (z) + f 2(Z) + ... be convergent in a region G and let F(z)
be its sum. Suppose that the series f~(z) + f~(::.) + f~(z) + ... converges ulliformly
in G anel its terms are cOluinuollS in G. Then
F'(z)
=
f~(::.)
+
f~(::.)
+ f~(z) +
for all ::. in G.
~------------------------------------------------------------------~
Test for Uniform Convergence
Uniform convergence is usually proved by the following comparison test.
THEOREM 5
Weierstrass' M-Test for Uniform Convergence
COil sider a series oftlze f01711 (1) ill a region G of the ::.-plane. Suppose that one can
find a convergent series qf cOllstallf terms,
(5)
such that If...(z)I ~ M", for all z in G and every
unifo17nZy cOllvergent ill G.
111
= 0,
1, .. ,
Then (1)
IS
5 KARL WEIERSTRASS (1815-1897), great German mathematician. who~e lifework was the development
of complex analysis based on the concept of power selies (see the footnote in Sec. 13.4). He aho made basic
contributions to the calculus. the calculus of variations. approximation theory. and differential geometry. He
obtained the concept of uniform convergence in 1841 (published 1894. sid); the first publication on the concept
was by G. G. STOKES (see Sec 10.9) in 1847.
SEC. 15.5
Uniform Convergence.
697
Optional
The simple proof is left to the student (Team Project 18).
E X AMP L E 4
Weierstrass M-Test
Does the following series converge uniformly in the disk
Izl : ":'
I?
Uniform convergence follows by the Weierstrass M-test and the convergence of LIIm 2 (see
Sec. 15.1. in the proof of Theorem 8) because
Solution.
_'m
I
/11
2
+ I
+ cosh mlzl
•
2
<
2
III
No Relation Between Absolute and
Uniform Convergence
We finally show the surprising fact that there are series that converge absolutely but not
uniformly, and others that converge uniformly but not absolutely, so that there is no
relation between the two concepts.
E X AMP L E 5
No Relation Between Absolute and Uniform Convergence
The series in Example 2 converges absolutely but not uniformly, as we have shown. On the other hand, the series
(_l)m-l
m=l
.\"2
+
+ m
(x real)
converge, uniformly on the whole real line but not absolutely.
Proof By the familiar Leibniz test of calculus ~see App. A3.3) the remainder Rn does not exceed its first
term in absolute value, since we have a series of alternating terms whose absolute values form a monotone
decreasing sequence with limit zero. Hence given E > 0, for all x we have
ifn >
I
N~E) ~ -
E
This proves uniform convergence, since N~ E) does not depend on x.
The convergence is not ab,olute because for any fixed x we have
k
>m
•
where k is a suitable consrant. and kL 11m diverges.
/1-8/
UNIFORM CONVERGENCE
Prove that the given series converges uniformly
indicated region.
1.
.L
n=O
(z - 2i)2n,
/z - 2i/ ::":' 0.999
00
In
the
2.2:
n=Q
...,2n+l
(2n
+ I)! '
CHAP. 15
698
Power Series, Taylor Series
(d) Example 2. Find the precise region of
convergence of the series in Example 2 with x replaced
by a complex variable z.
(e) Figure 366. Show that x 2 ~;;'~1 (1 + x 2 )-m = 1
if x =F 0 and 0 if x = O. Verify hy computation that the
partial sums .1'10 S2' S3, look as shown in Fig. 366.
_n
5.
L ;12 ' Izl;§: I
n~l
6.
L
n=l
7.
L
n=O
00
8.
L
n=l
[9-161
zn
n 2 cosh Ilkl
tanhn
/1
2
+
Izl ;§:
I
cos nlzl
,
y
1
Izl Izl ;§:
~
I
Izl ;§:
s
1010
10 20
-1
o
x
Fig. 366. Sum 5 and partial
sums III Team Project 18(e)
POWER SERIES
Find the region of uniform convergence. (Give reason.)
x (~+ I - 2;)n
x (Z _ ;)2n
9.
4"
10.
(2/l)!
L
n=O
L
n=O
12.
14.
L
(n)
n-2
2
<Xl
L
(3n
(2z - i)n
tanh 11);;:2n
119-201 HEAT EQUATION
Show that (9) in Sec. 12.5 with coefficients (10) is a solution
of the heat equation for t > 0, assuming that f(x) is continuous
on the interval 0 ;§: x ;§: L and has one-sided derivatives at
all interior points of that interval. Proceed as follows.
19. Show that IB"I is bounded, say
Conclude that
n=l
15.
L
n-l
_2n
~
16.
2
Y'n
L
n=O
(_l)n?n
(2n)!
17. CAS PROJECT. Graphs of Partial Sums. (a) Figure
365. Produce this exciting figure using your software
and adding fm1her curves. say, those of S256' SI024' etc.
(b) Power series. Study the nonuniformity of
convergence experimentally by plotting partial sums near
the endpoints of the convergence interval for real z = x.
18. TEAM
PROJECT.
Uniform
Convergence.
(a) Weierstrass M-test. Give a proof.
(b) Termwise differentiation. Derive Theorem 4
from Theorem 3.
(c) Subregions. Prove that uniform convergence of a
series in a region C implies unifonn convergence in
any portion of C. Is the converse true?
.
.--. ..-.
to. _ _ _ _ _ ..
1. What are power series? Why are these series very
important in complex analysis?
2. State from memory the ratio test, the root test, and the
Cauchy-Hadamard fomlLlla for the radius of
convergence.
3. What is absolute convergence? Conditional convergence?
Uniform convergence?
if
IB"I
<
t ~
K for all n.
to >
0
and. by the Weierstrass test. the series (9) converges
uniformly with respect to x and t for f ~ fo, 0 ;§: x ;§: L.
Using Theorem 2. show that II(X, t) is continuous for
1 ~ 10 and thus satisfies the boundary conditions (2)
for f ~ fo.
20. Show that Iillln/iltl < An2 Ke-An2to if 1 ~ to and the
series of the expressions on the right converges. by the
ratio test. Conclude from this. the Weierstrass test, and
Theorem 4 that the series (9) can be differentiated term
by term with respect to t and the resulting series has
the sum duliN. Show that (9) can be differentiated twice
with respect to x and the resulting series has the sum
a2 u/ilx 2 . Conclude from this and the result to Prob. 19
that (9) is a solution of the heat equation for all
t ~ to. (The proof that (9) satisfies the given initial
condition can be found in Ref. [CIO] listed in App. 1.)
STIONS AND PROBLEMS
4. What do you know about the convergence of power
series?
5. What is a Taylor series? What was the idea of obtaining
it from Cauchy's integral formula?
6. Give examples of practical methods for obtaining
Taylor series.
7. What have power series to do with analytic functions?
699
Summary of Chapter 15
8. Can propel1ies of functions be discovered from their
Maclaurin series? If so, give examples.
9. Make a list of Maclaurin series of c. cos z. sin z,
cosh z, sinh z, Ln (1 - z) from memory.
10. What do you know about adding and multiplying power
series?
111-201
RADIUS OF CONVERGENCE
TAYLOR AND MACLAURIN SERIES
Find the Taylor or Maclaurin series with the given point as
center and determine the radius of convergence. (Show
details.)
21. e".
22. Ln z.
7ri
23. 1/(1 - z),
25. 11(1 -
Find the radius of convergence. Can you identify the sum
as a familiar function in some of the problems? (Show the
details of your work.)
~
~1-301
27. liz,
29. cos Z,
d,
2
24. 11(4 - 3z),
-1
28. I"t- 1(e t
-i
o
30. sin2
~7r
+i
i
26. l1z2,
0
1
-
1) dt,
0
z, 0
(3z)n
11. L . . . - n!
n=O
n=l
Z2n+l
13'L
n~O 2n +
14.
1
16.
L
(-I)n zn
n~O
(2n)!
L
n~O
(-I)n
(z -
+
(2n
2y2n+l
I)!
(2z)2n
18.
n=O
20.
L --
n~O
(217)!
L
n~O
(z -
(3
i)"
+ 4i)n
31. Does every function fez) have a Taylor series?
32. Does there exist a Taylor series in powers of z - 1 - i
that diverges at 5 + 5i but converges at 4 + 6i?
33. Do we obtain an analytic function if we replace x by z
in the Maclaurin series of a real function f(x)?
34. Using Maclaurin series. show that if fez) is even. its
integral (with a suitable constant of integration) is
odd.
35. Obtain the first few terms of the Maclaurin series of
tan z by using the Cauchy product and
sin z
=
cos z tan z.
Power Series, Taylor Series
Sequences. series, and convergence tests are discussed in Sec. 15.1. A power series
is of the form (Sec. 15.2)
....,
(I)
n~O
Zo is its center. The series (1) converges for Iz - zol < R and diverges for
Iz - ;:';01 > R, where R is the radius of convergence. Some power series converge
for all z (then we write R = (0). In exceptional cases a power series may converge
only at the center; such a series is practically useless. Also, R = lim la"lan + 11 if this
limit exists. The series (I) converges absolutely (Sec. 15.2) and uniformly
(Sec. 15.5) in every closed disk Iz - zol ~ r < R (R > 0). It represents an analytic
function fez) for Iz - Zol < R. The derivatives t(z). f"(;::.), ... are obtained by
termwise differentiation of (1 ). and these series have the same radius of convergence
R as (1). See Sec. 15.3.
700
CHAP. 15
Power Series, Taylor Series
Conversely, every analytic function .f(::.) can be represented by power series. These
Taylor series of .f(z) are of the form (Sec. 15.4)
x
(2)
.f(z) =
L
1
I""
tnl(zo)(z - z.o)n
(Iz - zol < R),
n=O 11.
as in calculus. They converge for all z in the open disk with center Zo and radius
generally equal to the distance from :::0 to the nearest singularity of .f(:::) (point at
which .f(z) ceases to be analytic as defined in Sec. 15.4). If .f(z) is entire (analytic
for all :::; see Sec. 13.5). then (2) converges for all ;:. The functions eZ , cos z, sin:::,
etc. have Maclaurin series, that is, Taylor series with center 0, similar to those in
calculus (Sec. 15.4).
CHAPTER
/
16
Laurent Series.
Residue Integration
Laurent series generalize Taylor series. Indeed, whereas a Taylor series has positive integer
powers (and a constant term) and converges in a disk, a Laurent series (Sec. 16.1) is a
series of positive and negative integer powers of z - '::0 and converges in an annulus (a
circular ring) with center Zoo Hence by a Laurent series we can represent a given function
f(z) that is analytic in an annulus and may have singularities outside the ring as well as
in the "hole" of the annulus.
We know that for a given function the Taylor series with a given center '::0 is unique.
We shall see that, in contrast, a function f(z) can have several Laurent series with the
same center '::0 and valid in several concentric annuli. The most important of these series
is that which converges for 0 < Iz - zol < R. that is, everywhere near the center ::'0 except
at Zo itself. where Zo is a singular point of f(z). The series (or finite sum) of the negative
powers of this Laurent series is called the principal part of the singularity of f(z) at Zo,
and is used to classify this singularity (Sec. 16.2). The coefficient of the power 1/(;: - zo)
of this series is called the residue of f(z) at zoo Residues are used in an elegant and
powerful integration method, called residue integration, for complex contour integrals
(Sec. 16.3) as well as for certain complicated real integrals (Sec. 16.4).
Prerequisite: Chaps. 13, 14, Sec. 15.2.
Sections that may be omitted in a shorter course: 16.2, 16.4.
References and Answers to Problems: App. 1. Part 0, App. 2.
16.1
Laurent Series
Laurent series generalize Taylor series. If in an application we want to develop a function
f(z) in powers of Z - Zo when f(z) is singular at Zo (as defined in Sec. 15.4). we cannot
use a Taylor series. Instead we may use a new kind of series, called Laurent series, 1
consisting of positive integer powers of::. - Zo (and a constant) as well as negative integer
powers of z - ':::0; this is the new feature.
Laurent series are also used for classifying singularities (Sec. 16.2) and in a powerful
integration method ("residue integration", Sec. 16.3).
A Laurent series of f(::.) converges in an annulus (in the "hole" of which f(.:::) may have
singularities), as follows.
IPIERRE ALPHONSE LAURENT (1813-1854). French military engineer and mathematician, published the
theorem in 1843.
701
702
THEOREM 1
CHAP. 16
Laurent Series. Residue Integration
Laurent's Theorem
Let fez) be analytic in a domain c()ntaining two concentric circles C1 and C2 with
center Zo and the annulus betrveen them (blue in Fig. 367). Then fez) can be
represented by the Laurent series
(1)
... +
z - Zo
consisting of nonnegative lind negative powers. The coefficients of this Laurent series
are given by the integrals
(2)
taken coullterclockwise around allY simple closed path C that lies in the annulus
and encircles the inner circle, as in Fig. 367. [The variable of integration is denoted
by z* since z is used in (1).]
This series converges and represents fez) in the enlarged open allnulus obtained
from the given annulus by continuously increasing the outer circle C1 and decreasing
C2 until each of the fiFO circles reaches a point where fez) is singular.
III the important special case that :.':0 is the ollly singular point of fez) inside C2 ,
this circle can be shrunk to the point zo, giving convergence in a disk except at the
center. In this case the series (or finite sum) of the negative powers of (1) is called
the principal part of the singularity of fez) at zoo
.--
-
I
\
\
\
Fig. 367.
Laurent's theorem
COMMENT. Obviollsly, instead of (1). (2) we may write (denoting bn by a_ n )
(1')
n=-:JO
SEC. 16.1
703
Laurent Series
where all the coefficients are now given by a single integral formula, namely,
a
(2')
PROOF
n
= -1-
T
f(z*)
d-*
c (z* - Zo)n+1
~
2wi
(n = 0, ±l, ±2, .. ').
We prove Laurent's theorem. (a) The nonnegative powers are those of a Taylor series.
To see this, we use Cauchy's integral fOlmula (3) in Sec. 14.3 with z* (instead of z) as
the variable of integration and z instead of ~o. Let g(z) and h(::.) denote the functions
represented by the two terms in (3), Sec. 14.3. Then
fez)
(3)
=
g(z)
+ hU;) = -1.
2Wl
T -f(z*)- d::* c, z* - ::.
I
-.
2m
T -f(z*)- dz*.
C
z* -
2
Z
Here::. is any point in the given annulus and we integrate counterclockwise over both C1
and C2 , so that the minus sign appears since in (3) of Sec. 14.3 the integration over C2 is
taken clockwise. We transform each ofthese two integrals as in Sec. 15.4. The first integral
is precisely as in Sec. 15.4. Hence we get precisely the same result, namely, the Taylor
series of g(z),
I
g(z) = --.
(4)
2Wl
T -f(:;::*)
- - dz*
c, z* - Z
=
= an(z :L
zo)n
n~O
with coefficients [see (2), Sec. 15.4, counterclockwise integration]
an
(5)
=
I
-?
_WI
T
c,
(
f(z*)
d-*
z· - Zo )n+1 ~.
ok
Here we can replace C 1 by C (see Fig. 367), by the principle of deformation of path, since
Zo, the point where the integrand in (5) is not analytic, is not a point of the annulus. This
proves the formula for the an in (2).
(b) The negative powers in (1) and the formula for bn in (2) are obtained if we consider
h(z) (the second integral times -J/(2wi) in (3). Since z lies in the annulus, it lies in the
exterior of the path C2 . Hence the situation differs from that for the first integral. The
essential point is that instead of [see (7*) in Sec. 15.4]
(6)
(a)
I z-::'°I<1
z* - Zo
we now have
z* - Zo
(b)
1
1
<
1.
z - Zo
Consequently, we must develop the expression I/(z* - z) in the integrand of the second
integral in (3) in powers of (::.* - Zo)/(z - Zo) (instead of the reciprocal of this) to get a
convergent series. We find
1
z* - z
-1
::.* - Zo - (z -
::'0)
(z - Zo)
(1 _ z*z -- Zozo)
.
704
CHAP. 16
Laurent Series. Residue Integration
Compare this for a moment with (7) in Sec. 15.4. to really understand the difference. Then
go on and apply formula (8), Sec. 15.4. for a finite geometric sum. obtaining
1-=--1-Z* - Z
;: - Zo
{ 1+ z* - Zo + (z* - Zo )2 + ... +
z - ':0
::. - Zo
__
1
(Z* - zO)n+l
z-z*
Z-2o
Multiplication by -f(.:*)/27Ti and integration over C2 on both sides now yield
1
l1(z) = - --.
27Tl
_1_.
27TI
f
-f(z*)
- - dz*
e2 z* - z
{_1_ r1
Z - Zo
f(z*) dz*
+
e2
+
1
(z - ZO)2
1
(z - zo)n+l
+ ...
1
(z* - 2o)f(z*) dz*
1
r
(z* - zo)nf(z*) dZ""'-}
re2
+ Rn*(z)
e2
with the last term on the right given by
R*(
(7)
n
z)
=
1
?
(
)n+l
_7rIZ-2o
1
re
2
(z* - ::o)n+l f(z*) dz*.
z-z'ok
As before. we can integrate over C instead of C2 in the integrals on the right. We see that
on the right, the power 1/(z - zo)n is multiplied by b n as given in (2). This establishes
Laurent's theorem, provided
lim R~(z)
(8)
~x
= O.
(c) COllvergellce proofof (8). Very often (1) will have only finitely many negative powers.
Then there is nothing to be proved. Otherwise, we begin by noting that f(z*)/(z - z*) in
(7) is bounded in absolute value, say.
I
f(z*)
z - z*
I<
M
for all z* on C2
because f(z*) is analytic in the annulus and on C2 , and z* lies on C2 and z outside, so
that z - z* =/= O. From this and the ML-inequality (Sec. 14.1) applied to (7) we get the
inequality (L = length of C2 , Iz* - zol = radius of C2 = const)
~
oki
lR;"(z)
1
~ 2 1
7T Z - Zo
~
In+l Iz* - zoln+l ML
ML
= -
27T
I
z* - 20 In+l
Z - Zo
SEC. 16.1
705
Laurent Series
From (6b) we see that the expression on the right approaches zero as n approaches infinity.
This proves (8). The representation (1) with coefficients (2) is now established in the given
annulus.
(d) C01lverge1lce of (1) i1l the e1llarged a1l1lulus. The first series in (1) is a Taylor
series [representing g(z)]; hence it converges in the disk D with center Zo whose radius
equals the distance of the singularity (or singularities) closest to zoo Also, g(z) must be
singular at all points outside C I where fez) is singular.
The second series in (I), representing h(z), is a power series in Z = 1/(z - Zo). Let the
given annulus be 1"2 < Iz - zol < r l , where 1"1 and r2 are the radii of C I and C2, respectively
(Fig. 367). This corresponds to 1/r2 > Izi > lirl' Hence this power series in Z must
converge at least in the disk Izi < 1/r2' This corresponds to the exterior Iz - Zol > r2 of
C2. so that h(z) is analytic for all z outside C2. Also, h(z) must be singular inside C2 where
fez) is singular, and the series of the negative powers of (I) converges for all z in the exterior
E of the circle with center Zo and radius equal to the maximum distance from <'0 to the
singularities of fez) inside C2. The domain common to D and E is the enlarged open annulus
characterized near the end of Laurent's theorem, whose proof is now complete.
•
Uniqueness. The Laurent series of a given analytic function fez) in its annulus of
cOllvergence is ullique (see Team Project 24). However. fez) may have different Laurent selies
ill two anlluli with the same center; see the examples below. The uniqueness is essential. As
for a Taylor series, to obtain the coefficients of Laurent series, we do not generally use the
integral formulas (2); instead, we use various other methods, some of which we shall illustrate
in our examples. If a Laurent series has been found by any such process, the uniqueness
guarantees that it must be the Laurent series of the given function in the given annulus.
E X AMP L E 1
Use of Maclaurin Series
Find the Laurent series of z-5 sin:: with center O.
Solutio1l.
By (14). Sec. 15.4. we obtain
-5.
::
~
(-I)n
SIn Z =:::0 (2n
+
I
2n-4
I)! Z
=
I
6;?
;:4 -
I
+
I
120 -
2
5040 z
+ - ...
(Izl >
0).
Here the "annulus" of convergence is the whole complex plane without the origin and the pl'incipal part of
the series at 0 is Z-4 - ~Z-2.
•
E X AMP L E 2
Substitution
Find the Laurent series of z2 e1/z with center O.
Solution.
From l12) in Sec. 15.4 with.:: replaced by liz we obtain a Laurent senes whose principal part is
an infinite series,
(Izl >
E X AMP L E 3
0). •
Development of 1/(1 - z)
Develop 1/(1 - z)
(a) in nonnegative powers
of~,
(b) in negative powers of z.
Solutio1l.
I
x
= L -- n
1- z
(a)
(valid if Izl < I).
n=O
(b)
I - z
-]
z(l - Z-l) = -
=
n~o
I
zn+l
I
= -
~
I
-
:;2 - . . .
(valid if Izl > I). •
706
E X AMP L E 4
CHAP. 16
Laurent Series. Residue Integration
Laurent Expansions in Different Concentric Annuli
Find all Laurent series of 1I(~3 - ~ 4) with center O.
Solution.
Multiplying by IIz
3
we get from Example 3
,
I
a:J
L:
-3--4 =
(I)
Z
-z
;;:n-3 =
n=O
E X AMP L E 5
z -
L:
-
n+4
n-O Z
Z
Use of Partial Fractions
<
Izl <
1).
;; _ ...
3
(Izl >
1).
•
+3
-2z
Solution.
(0
Z
Z
x
I
-3--4 =
(II)
I
+-+I+z+'"
2 +
Z3
In terms of partial fractions,
I
f(::.) = -~
::.-2
(a) and (b) in Example 3 take care of the first fraction. For the second fraction,
x
(c)
~ z)
z-2
2 (I -
(d)
z - 2
z
(I) From (a) and (c), valid for
fez)
Izl <
=
1(z)
=
L:
(1 +
<
Izl <
1
a:J
2n+l ;;:n -
n=O
L:
n=O
(Ill) From ~d) and (b). valid for
Izl >
2nl+1) zn
= -
L:
(1::.1 < 2),
n=O
= -
L:
2n
(izi >
';"n+l
2).
n=O .....
%+ %z + ~
=
I
7n+1
+ ...
=
2
~
2,
1
(2n + I)
n=O
Z2
2,
00
fez)
I
2n+l zn
L:
I (see Fig. 368),
= n~o
(II) From (c) and (b), valid for 1
(I -~)
=
n+l
2
= - -
Z
9
3
Z
•
y
...... III
II
I
Fig. 368.
"
x
Regions of convergence in Example 5
If fez) in Laurent's theorem is analytic inside C2 , the coefficients b n in (2) are zero by
Cauchy's integral theorem, so that the Laurent series reduces to a Taylor series. Examples
3(a) and 5(1) illustrate this.
SEC. 16.2
~1=6J
707
Singularities and Zeros. Infinity
115-231
LAURENT SERIES NEAR A SINGULARITY
ATO
Expand the given function in a Laurent series that
converges for 0 < Id < R and determine the precise region
of convergence. (Show the details of your work.)
1.
4.
_3
Z-3 e llz
17-141
17.
2
6.
Z2 -
Z
21.
Z3
LAURENT SERIES NEAR A SINGULARITY
AT Zo
Expand the given function in a Laurent series that
converges for 0 < I:: - ::01 < Rand detennine the precise
region of convergence. (Show details.)
eZ
7. - - ,
z-
9.
ll.
Z2
(z
I
+
+
I
8.
= i
10.
Zo =
':::0
I
2
i) - (z
+
i)
'
::0
(Z
+
cos.:::
i)
2.
1
14. ::: smh - ,
Z
16.2
Zo =
-i
::0
0
=
7T)4
,
Zo =
13.
~7T
7T
~4
4
I
0
16.
, :::0 = 0
18.
:::0
=
1
-
Z2
= ;:0
and
, :::0 =
-
(;: - i)2
4z - I
I
sin Z
23.
Z
,
Zo =
I
-
sinh;:;
2;::.2
Z4 -
Zo =
~
+ ~7T
,
::0
=
:::0 =
0
i
20.
22.
(:: _2
1)4
::0 =
::0 =
;
~.
':::0 = -~7T
24. TEAM PROJECT. Laurent Series. (a) Uniqueness.
Prove that the Laurent expansion of a given analytic
function in a given annulus is unique.
(b) Accumulation of singularities. Does tan (II:)
have a Laurent series that converges in a region
o < Izl < R? (Give a reason.)
(c) Integrals. Expand the following functions in a
Laurent series that converges for Izl > 0:
:::
Z -
,
"
I
2
Z2 2
Zo =
= -;
Z3
12.
sin :::
(z - ~7T)3
(::: -
Z3
I (.
19.
Z2
e
I -
_3
cosh 2;:
-<.
5.
15.
Z
e- z
3.
Find all Taylor and Laurent series with center::
determine the precise regions of convergence.
Z2
l
2. ;: cos-
Z5
Z4 -
TAYLOR AND LAURENT SERIES
(et-I
L--dr,
0
1
-I
Z3
I
Z
0
sin t
--dt.
t
25. CAS PROJECT. Partial Fractions. Write a program
for obtaining Laurent series by the use of partial
fractions. Using the program, verify the calculations in
Example 5 of the text. Apply the program to two other
functions of your choice.
Singularities and Zeros. Infinity
Roughly, a singular point of an analytic function fez) is a ::0 at which f(::) ceases to be
analytic, and a ::ero is a z at which fez) = O. Precise definitions follow below. In this
section we show that Laurent series can be used for classifying singularities and Taylor
series for discussing zeros.
Singularities were defined in Sec. 15.4, as we shall now recall and extend. We also
remember that. by definition, a function is a single-valued relation, as was emphasized
in Sec. 13.3.
We say that a function fez) is singular or has a singularity at a point;:: = Zo if fez) is
not analytic (perhaps not even defined) at z = zo, but every neighborhood of z = Zo
contains points at which fez) is analytic. We also say that z = Zo is a singular point of fez).
We call z = Zo an isolated singularity of fez) if z = Zo has a neighborhood without
further singularities of fez). Example: tan z has isolated singularities at ± 7T12, ±37T/2, etc.;
tan (lIz) has a nonisolated singularity at o. (Explain!)
708
CHAP. 16
Laurent Series. Residue Integration
Isolated singularities of fez) at z
=
Zo can be classified by the Laurent series
(1)
(Sec. 16.1)
valid in the immediate neighborhood of the singular point z
is, in a region of the form
o < Iz -
= zo, except at Zo itself, that
zol < R.
The sum of the first series is analytic at z = zo, as we know from the last section. The
second series, containing the negative powers, is called the principal part of (1), as we
remember from the last section. If it has only finitely many terms, it is of the form
+ ... +
(2)
Then the singularity of fez) at z = Zo is called a pole, and m is called its order. Poles of
the first order are also known as simple poles.
If the principal part of (I) hac; infinitely many terms, we say that fez) has at z = Zo an
isolated essential singularity.
We leave aside nonisolated singularities.
E X AMP L E 1
Poles. Essential Singularities
The function
fez) = z(z -
3
+ (z - 2)2
2)5
has a simple pole at z = 0 and a pole of fifth order at z
singularity at z = 0 are
= 2. Examples of functions having an isolated essential
and
sin -
-
z -
L
n-O (211
~-l)'"
I
+ 1)!in + 1
z
3!Z3
+ 5'
.Z
5
-
+ ....
Section 16.1 provides further examples. For instance, Example I shows that z-5 sin;: has a fourth-order pole
at O. Example 4 shows that l/(z3 - Z4) has a third-order pole at 0 and a Laurent series with infinitely many
negative powers. This is no contradiction, since this series is valid for Izl > 1; it merely tells us that in classifying
singularities it is quite important to consider the Laurent series valid ill tile immediate Ileigllborllood of a singular
•
point. In Example 4 this is the series (I), which has three negative powers.
The classification of singularities into poles and essential singularities is not merely a
formal matter, because the behavior of an analytic function in a neighborhood of an
essential singularity is entirely different from that in the neighborhood of a pole.
E X AMP L E 2
Behavior Near a Pole
fez) = I/z2 has a pole at z = 0, and If(z)1 ~
theorem.
x as ;;; ....... 0 in any manner. This illustrates the foIlowin"
•
eo
SEC. 16.2
709
Singularities and Zeros. Infinity
THEOREM 1
Poles
If f(z) is analytic and has a pole at z =
Zo, then
If(z)1 ~
(Xl
as Z ~
Zo
i17 anv manner.
The proof is left to the student (see Prob. 12).
E X AMP L E 3
Behavior Near an Essential Singularity
The function fez) = el/z has an essential singularity at z = O. It has no limit for approach along the imaginary
axis; it becomes infinite if z ..... 0 through positive real values, but it approaches zero if <: --+ 0 through negative
real values. It takes on any given value c = coia 0 in an arbitrarily small E-neighborhood of;:: = O. To see
the letter. we set z = reill, and then obtain the following complex equation for rand 8. which we must ~olve:
'*
ellz = e<'cos 0 - i sin tJ)/r = cOeia
Equating the absolute values and the arguments, we have e'cos mh· = co' that is
cos8= rlnco,
respectively. From these two equations and cos2 8
and
-sin 8 = ar
+ sin2 8 = r2(ln cO)2 +
and
2
a r2
= I we obtain the formulas
a
tan8= - - - .
Inca
Hence r can be made arbitrarily small by adding multiples of 27T to a, leaving c unaltered. This illustrates the
very famous Picard's theorem (with z = 0 as the exceptional value). For the rather complicated proof. see Ref.
•
[D4J. voL 2. p. 258. For Picard. see Sec. 1.7.
THEOREM 2
Picard's Theorem
If f(z) is analytic alld has all isolated essential singularity at a point zo, it takes Oil
eve I}' value, with at most olle exceptional value, in an arbitrarily small E-neighborhood
oJzo·
Removable Singularities. We say that a function f(::) has a removable sillgulartty at
z = Zo if f(z) is not analytic at z = Zo, but can be made analytic there by assigning a
suitable value f(zo). Such singularities are of no interest since they can be removed as
just indicated. Example: fez) = (sin z)/z becomes analytic at z = 0 if we define f(O) = I.
Zeros of Analytic Functions
A zero of an analytic function fez) in a domain D is a :: = :::0 in D such that f(zo) = O.
A zero has order n if not only f but also the derivatives f', fIt, ... , f n - ll are all 0 at
Z = Zo but fn)(Zo) *- O. A fIrst-order zero is also called a simple zero. For a second-order
zero, f(Zo) = f' (zo) = 0 but f"(zo) *- O. And so on.
E X AMP L E 4
Zeros
The function L + ;::2 has simple zeros at :!:i. The function (1 - -;;4)2 has second-order zeros at:!: I and :!:i. The
function (::: - a)3 has a third-order zero at Z = a. The function eZ has no zeros (see Sec. 13.5). The function
sin z has simple zeros at 0, :!:7T, :!:27T, ... , and sin2 z has second-order zeros at these points. The function
I - cos Z has second-order zeros at 0, :!:27T. :!:47T, ... , and the function (I - cos Z)2 has fourth-order zeros
at these points.
•
CHAP. 16
Laurent Series. Residue Integration
Taylor Series at a Zero. At an nth-order zero ::: = :::0 of f(:::), the derivatives f' (Zo), ..• ,
['n-1)(:::o) are zero, by definition. Hence the first few coefficients (/o, . . . , a n -l of the
Taylor series (1), Sec. 15.4, are zero, too, whereas lin =1= 0, so that this series takes the
form
f(:::)
(3)
= lIn(::
= (z
-
zo)n
+ {/n+l(::
- ::o)n [an
+
- ':0)'1+1
(/n+l(Z - <::0)
+
+ ...
(/n+2(::: -
:::0)2
+ ... ]
This is characteristic of such a zero, because if f(::) has such a Taylor series, it has an
nth-order zero at ::: = :::0' as follows by differentiation.
Whereas nonisolated singularities may occur, for zeros we have
-
THEOREM 3
---------------------------------------------------------------,
Zeros
The zeros of an analytic filllction f(;::) (¥= 0) are isolated; that is, each of them has
a neighborhood that c01l1aills no further :::eros of fez).
I ROO F
The factor (::: - :::0)" in (3) is zero only at ::: = :::0' The power series in the brackets
[ ... ] represents an analytic function (by Theorem 5 in Sec. 15.3), call it g(z). Now
g(Zo) = an =1= 0, since an analytic function is continuous, and because of this continuity,
also g(:::) =1= 0 in some neighborhood of z = :::0' Hence the same holds of f(:::).
•
This theorem is illustrated by the functions in Example 4.
Poles are often caused by zeros in the denominator. (Example: tan z has poles where
cos::: is zero.) This is a major reason for the imp0l1ance of zeros. The key to the connection
is the following theorem, whose proof follows from (3) (see Team Project 24).
--R": • 4
Poles and Zeros
Let fez) be analytic at z = Zo and have a zero of nth order at z = :::0' Then lIf(z)
has a pole of 1I1h order at .: = :::0; and so does h(:::)lf(:::), provided he:::) is allalytic
at Z = 20 and 17(:::0) =1= 0.
Riemann Sphere. Point at Infinity
When we want to study complex functions for large Izl, the complex plane will generally
become rather inconvenient. Then it may be better to use a representation of complex
numbers on the so-called Riemann sphere. This is a sphere S of diameter 1 touching the
complex z-plane at z = (Fig. 369), and we let the image of a point P (a number z in the
plane) be the intersection P* of the segment PN with S, where N is the "North Pole"
diametrically opposite to the origin in the plane. Then to each z there corresponds a point
on S.
Conversely, each point on S represents a complex number z, except for N, which does
not con'espond to any point in the complex plane. This suggests that we introduce an
additional point, called the point at infinity and denoted CG ("infinity") and let its image
be N The complex plane together with :JO is called the extended complex plane. The
complex plane is often called the finite complex plane, for distinction, or simply the
°
SEC. 16.2
711
Singularities and Zeros. Infinity
N
Fig. 369.
Riemann sphere
complex plane as before. The sphere S is called the Riemann sphere. The mapping of
the extended complex plane onto the sphere is known as a stereographic projection.
(What is the image of the Northern Hemisphere? Of the Western Hemisphere? Of a straight
line through the origin?)
Analytic or Singular at Infinity
If we want to investigate a function .fez) for large 1z1, we may now set.;: = 1111" and investigate
.f(z) = .fO/w) == g(w) in a neighborhood of w = O. We define .f(z) to be analytic or singular
at infinity if g(w) is analytic or singular. respectively, at w = O. We also define
(4)
g(O)
= lim
zo->o
g(w)
if this limit exists.
Furthermore, we say that f(z.) has an nth-order zero at infinity if f(l/w) has such a zero
at w = O. Similarly for poles and essential singularities.
E X AMP L E 5
Functions Analytic or Singular at Infinity. Entire and Meromorphic Functions
The function f(z.) = 11z2 is analytic at x since g(w) = f(l/w) = .r 2 is analytic at w = 0, and fez) has a secondorder zero at x. The function .t(;:) = 2 3 is singular at x and has a third-order pole there since the function
3
Z
g(w) = .to/w) = 1Iw has such a pole at w = O. The function e has an essential singularity at Cf) since eV ",
has such a singularity at II' = O. Similarly, cos z and sin z have an essential singularity at x.
Recall that an entire function is one that is analytic everywhere in the (finite) complex plane. Liouville's
theorem (Sec. l4...l) tells us that the only boullded entire functions are the constants, hence any nonconstant
entire function must be unbounded. Hence it has a singularity at x, a pole if it is a polynomial or an essential
singularity if it is not. The functions just considered are typical in this respect.
An analytic function whose only singularities in the finite plane are poles is called a meromorphic function.
Examples are rational function, with nonconstant denominator, tan ;:, cot z, sec z, and eSc z.
•
In this section we used Laurent series for investigating singularities. In the next section
we shall use these series for an elegant integration method .
.... [1-101
SINGULARITIES
Determine the location and kind of the singularities of the
following functions in the finite plane and at infinity. In the
case of poles also state the order.
1. tan 2
7TZ
2. z
+
2
3
3. cot Z2
4.
5. cos z - sin z
6. lI(cos z - sin z)
Z3 e l/(Z-1l
CHAP. 16
712
Laurent Series. Residue Integration
21. (1 - cos Z)2
sin 3z
7.
(Z4 _ 1)4
4
8. - - +
1
Z -
9. cosh [lie
8
2
(z - 1)
2
(;: -
l)3
10. e ll(Z-l)/(e Z -
+ 1)]
1)
11. (Essential singularity) Discuss e llz2 in a similar way
as e llz is discussed in Example 3.
12. (Poles) Verify Theorem I for f(:)
Theorem 1.
113-221
=
:::-3 -
Prove
Z-I.
ZEROS
Determine the location and order of the zeros.
14. (Z4 - 16)4
13. (z + 16i)4
15.
:::-3
17. (3z
2
sin 3
+
l)e-
19. (,2 + 4)(eZ
16.3
16. cosh 2 :::
7fZ
18. (Z2 - 1)2(eZ2
Z
-
-
L)
20. (sin z - 1)3
l)2
23. (Zeros) If f(:) is analytic and has a zero of order 11 at
z = :0' show that f2(Z) has a zero of order 211.
24. TEAM PROJECT. Zeros. la) Derivative. Show that
if f(:) has a zero of order 11 > I at: = :0' then I' (:)
has a zero of order 11 - 1 at ::'0.
(b) Poles and zeros. Prove Theorem 4.
(e) Isolated k-points. Show that the points at which
a nonconstant analytic function fez) has a given value
k are isolated.
(d) Identical functions. If ftC;:) are analytic in a
domain D and equal at a sequence of points Zn in D
that converges in D, show that fl(:) == .f2(::') in D.
25. (Riemann sphere) Assuming that we let the image of
the x-axis be meridians 0° and 180°, describe and
sketch (or graph) the images of the following regions
on the Riemann sphere: (a) Izl > LOO. (b) the lower
half-plane, (c) ! ~ 1::.1 ~ 2.
Residue Integration Method
The purpose of Cauchy's residue integration method is the evaluation of integrals
T.c fez) dz
taken around a simple close path C. The idea is as follows.
If fez) is analytic everywhere on C and inside C, such an integral is zero by Cauchy's
integral theorem (Sec. 14.2), and we are done.
If fez) has a singularity at a point z = Zo inside C, but is otherwise analytic on C and
inside C, then fez) has a Laurent series
fez)
=
2::
an(z - zo)n
+
n~O
z-
Zo
that converges for all points near z = Zo (except at z = Zo itself), in some domain of the
form 0 <
zol < R (sometimes called a deleted neighborhood, an old-fashioned term
that we shall not use). Now comes the key idea. The coefficient hI of the first negative
power lI(z - zo) of this Laurent series is given by the integral formula (2) in Sec. 16.1
with 11 = 1, namely,
1
hI = - 2.
fez) dz.
Iz -
7ft
T.
C
Now, since we can obtain Laurent series by various methods, without using the integral
formulas for the coefficients (see the examples in Sec. 16.1), we can find hI by one of
those methods and then use the formula for hI for evaluating the integral, that is,
(1)
SEC 16.3
713
Residue Integration Method
Here we integrate conunterclockwise around a simple closed path C that contains z
in its interior (but no other singular points of fez) on or inside C!).
The coefficient hi is called the residue of fez) at z = Zo and we denote it by
= Res
hI
(2)
=
Zo
fez).
Z=Zo
E X AMP L E 1
Evaluation of an Integral by Means of a Residue
Integrate the function f(z) = Z-4 sin z counterclockwise around the unit circle C.
Solution.
From (14) in Sec. 15.4 we obtain the Laurent series
sin z
1
I
z
z
z3
Z
-3'z +
.
f(z.) = - 4 - = "3 -
-5' -
.
71.
+ - ...
which converges for Izl > 0 (that is, for all z 1= 0). This series shows that J(z) has a pole of third order at z = 0
and the residue b i = -113!. From (1) we thus obtain the answer
J.
rc
E X AMP L E 2
CAUTION!
z
•
TTi
= 27fib 1 = - ""3
dz
Use the Right Laurent Series!
Integrate f(z) = I/(z3
Solution.
sin z
-4-
Izl
Z4) clockwise around the circle C:
= 112.
= .:3(1 - z) shows that J(z) is singular at z = 0 and z = l. Now z = 1 lies outside C.
Hence it is of no interest here. So we need the residue of ftz) at O. We find it from the Laurent series that
converges for 0 < Izl < l. This is series (I) in Example 4, Sec. 16.1,
z3 - z4
1
1
I
1
---=-+-+
+I+z+'"
Z3 -
l
Z4
Z2
(0
Z
<
Izl <
I).
(Izl >
1),
We see from it that this residue is 1. Clockwise integration thus yields
J.
r
dz
-3--4 =
-27fi Res f(z) = -27fi.
z-o
cZ-z
C4UTlON! Had we used the wrong series (II) in Example 4, Sec. 16.1,
•
we would have obtained the wrong answer, 0, because this series has no power liz.
Formulas for Residues
To calculate a residue at a pole, we need not produce a whole Laurent series, but, more
economically, we can derive formulas for residues once and for all.
Simple Poles.
(3)
Two formulas for the residue of f(:::;) at a simple pole at
Res fez)
= hI = lim
Zo
are
(z - zo)f(z)
Z----7Zo
Z=Zo
and, assuming that f(;:,) = p(z)lq(z), p(zo) =1= 0, and q(z) has a simple zero at Zo (so that
fez) has at;:,o a simple pole, by Theorem 4 in Sec. 16.2),
(4)
Res fez)
Z=20
p(z)
= Res 2=20
q(z)
714
CHAP. 16
PRO 0 F
Laurent Series. Residue Integration
For a simple pole at
z=
Zo the Laurent series (1), Sec. 16.1, is
(0
'*
Here b l
O. (Why?) Multiplying both sides by z the formula (3):
lim (z - ;;;o)f(z)
=
bi
+
lim (z - Zo)[ao
Z-+Zo
Z---i>Zo
':0
< Iz - zol <
and then letting z ~
':0'
R).
we obtain
+ al(Z - zo) + ... ] = b i
where the last equality follows from continuity (Theorem L. Sec. 15.3).
We prove (4). The Taylor series of q(::.) at a simple zero ':0 is
,
2!
f = plq and then f into
Substituting this into
.
q"(zo)
(z - ':o)p(,:)
hm
::'0) - - =
q(z)
Z-+Zo
+
(3) gives
p(z).
Res fez) = lim (z Z~Zo
(.: - zol
+
q(z) = (z - zo)q (zo)
Z-+Zo
(.: -
zo}[q'(;::o)
+ (z
- ::.o)q"(zo)!2
+ ... ]
z - Zo cancels. By continuity, the limit of the denominator is q' (zo) and (4) follows . •
E X AMP L E 3
Residue at a Simple Pole
f(:) = (9:
+ 0/(:3 + ;::) has a simple pole at i because :2 + I
Res
9;:: + i
z~i ;::{;::2
By (4) with
p{i) =
+
= lim (: -
9; + i and
z~;
(5)
9: + i
;::(;::
l/ (;::) ~ 3;::2 +
Res
Poles of Any Order.
i)
z~i
I)
9;;:2 + i
:(;:: + I)
.
I){:: -
+
.
I)
=
= (:
[
+ i )(z
9: + i ]
---.
;::{;:: + I)
-
i). and (3) gives the residue
!Oi
z~i
= -
-2
= -5;.
I we confirm the result.
=
[9;::
+i ]
-2-3~
+ 1
!Oi
Fi
= -
-
2
= -5i.
The residue of fez) at an mth-order pole at Zo is
~~~ fez)
=
1
(m _ I)! !~~o
In particular, for a second-order pole (m
{d"'-I [
]}
dzm-I (z - zoynf(z)
.
= 2),
(5*)
PROOF
The Laurent series of f(z) converging near Zo (except at
where b1n
'* O. The residue wanted is b
l.
Zo
itself) is (Sec. 16.2)
Multiplying both sides by (z - zoyn gives
•
SEC. 16.3
715
Residue Integration Method
We see that hI is now the coefficient of the power (z - ;::0)',,-1 of the power series of
= (z - ::'o),"f(;::). Hence Taylor's theorem (Sec. 15.4) gives (5):
g(;::)
hI
=
(m -
l)!
g'm-ll (:0)
d"'-I
dzm-I
(m - l)!
1
E X AMP L E 4
- - [(7 - 7 )'''f(7)]
""
~o
'.'
•
Residue at a Pole of Higher Order
f(::;) =
(;: +
+ 2::;2 - 7::; + 4) has a pole of second order at ;: = 1 because the denominator equals
1)2 (verify!). From (5*) we obtain the residue
50::;/(::;3
4)(;: -
= lim -d
.~1
d;:
(50;:
-)
z+4
•
200
= - 2 = 8.
5
Several Singularities Inside the Contour.
Residue Theorem
Residue integration can be extended from the case of a single singularity to the case of
several singularities within the contour C. This is the purpose of the residue theorem. The
extension is surprisingly simple.
THEOREM 1
Residue Theorem
Let f(;::) he analytic imide a simple closed path C alld 011 C. except forfillite!y many
singular points ::'1, Z2, •.. , z" inside C. Then the integral of f(z.) taken cO/lIlterclockll"ise
around C equals 27Ti times the sum of the residues of f(;::) {{f Z.I, ••• , Zk:
k
(6)
fefc::,)
d::.
=
27Ti
2:: ~~s fez).
j~I
J
c
Fig. 370.
Residue theorem
716
CHAP. 16
PROOF
Laurent Series. Residue Integration
We enclose each of the singular points Zj in a circle Cj with radius small enough that those
k circles and C are all separated (Fig. 370). Then fez) is analytic in the multiply connected
domain D bounded by C and C h . . . , Ck and on the entire boundary of D. From Cauchy's
integral theorem we thus have
f fez) dz + f fez) dz + f fez) dz + ... + f fez) dz = 0,
(7)
~
C
~
~
the integral along C being taken counterclockwise and the other integrals clockwise (as in
Figs. 351 and 352, Sec. 14.2). We take the integrals over C1 , . . . , Ck to the right and
compensate the resulting minus sign by reversing the sense of integration. Thus,
f fez) dz = f fez) dz + f fez) dz + ... + f fez) dz
(8)
C
C,
C2
Ck
where all the integrals are now taken counterclockwise. By (1) and (2),
f fez) dz = 27Ti
j
Res fez),
~
= 1, ... , k,
z=~
•
so that (8) gives (6) and the residue theorem is proved.
This important theorem has various applications in connection with complex and real
integrals. Let us first consider some complex integrals. (Real integrals follow in the next
section.)
E X AMP L E 5
Integration by the Residue Theorem. Several Contours
Evaluate the following integral counterclockwise around any simple closed path such that (a) 0 and 1 are inside
C, (b) 0 is inside, I outside, (c) I is inside, 0 outside, (d) 0 and I are outside.
r1,C
Solution.
d;:
- Z
;:
The integrand has simple poles at 0 and I, with residues [by (3)]
4-3;: = [4-3;:J
---
Res - - Z~O z(z - 1)
z - I
[Confirm this by (4).] Ans. (a) 21Ti(-4
E X AMP L E 6
4 - 3;:
-2--
Z~O
+
=
4-3z
Res -- =
z~l z(z - I)
-4.
[4-3zJ
--Z
1) = -61Ti, (b) -81Ti, (c) 21Ti, (d) O.
z~l
= I.
•
Another Application of the Residue Theorem
Integrate (tan Z)/(Z2 -
I) counterclockwise around the circle C: Izl = 312.
Solution.
tan;:; is not analytic at ±1T/2, ±31T12, ... , but all these points lie outside the contour C. Because
of the denominator Z2 - I = (z - 1)(z + I) the given function has simple poles at ± I. We thus obtain from
(4) and the residue theorem
f
tanz
-2-C Z - I
dz
=
=
(tanz
21Ti Res
z~l
-2--
Z -
tanzl
21Ti ( 2;:
z~l
1
+
Res
tan;:
+ -
= 21Ti tan 1 = 9.7855i.
~an;:
z~-l
2;::
)
z - I
I )
z~-l
•
SEC. 16.3
717
Residue Integration Method
E X AMP L E 7
Poles and Essential Singularities
Evaluate the following integral, where C is the ellipse 9x 2 +
1 (4::.
Jc
eTrZ
z -
i
= 9 (counterclockwise, sketch it).
+ ze'n'/Z)
16
d::.
4
Solution.
Since::. - 16 = 0 at ±2i and ±2. the first tenn of the integrand has simple poles at ±2i inside
C, with residues [by (4); note that e27Ti = 1]
Res
ze=
1
16
z=-2i
and simple poles at ±2, which lie outside C, so that they are of no interest here. The second term of the integrand
has an essential singularity at 0, with residue 71'2/2 as obtained from
(Izl >
16 +
AilS. 271';(-16 -
2
!71' )
=
7r! 71"2 -
!)i
•
= 30.22 Ii by the residue theorem.
. : c ;
1. Verify the calculations in Example 3 and find the other
residues.
2. Verify the calculations in Example 4 and find the other
residue.
!3:@
1
5.
cos
4+z
z
16.
sin z
Z2
6.
+
Z2 -
I
z
10.
z
12.
Z4 -
l
13. CAS PROJECT. Residue at a Pole. Write a program
for calculating the residue at a pole of any order. Use
it for solving Probs. 3-8.
~
f
c
sin 7rZ
--4-
z
dz,
C:
/z - i/
=
2
dz.
c: Izl =
tan
7rZ
dz.
C: Izl = 2
eZ
--dz.
c:
cos "-
tc
tc
e'
--dz.
fc
tan 7rZ
-_-3-
f
25.
f
c
cos
coshz
3iz
I
= 4.5
C: Iz - il = 1.5
7rZ
Z2 -
Izl
C: Izl = ]
dz,
dz,
C: Iz
<.
+ ~il = I
I - 4::. + 6z 2
(Z2
1.4
C: Izl = I
coth z dz.
24.
RESIDUE INTEGRATION
Evaluate (counterclockwise). (Show the details.)
14.
23.
11 =
Iz -
TTZ
fc
22.
c:
tan
20.
Z2
C: Izl = 1
dz,
d7
tc
1
lfz
sinh !7rz
19.
21.
11. tan
f
f
c
1/3
;:4 -
e
tc
c
8. sec z
7. cot Z
17.
18.
4.~
2
f
c
RESIDUES
Find all the singular points and the corresponding residues.
(Show the details of your work.)
3.
15.
+ !){2
0).
- z) dz.
- 23z + 5
(2z
])2(3z - 1) dz,
c
c: /z/
= I
30::. 2
C: /z/ = I
CHAP. 16
718
16.4
Laurent Series. Residue Integration
Residue Integration of Real Integrals
It is quite surprising that certain classes of complicated real integrals can be integrated
by the residue theorem, as we shall see.
Integrals of Rational Functions of cos () and sin ()
We first consider integrals of the type
(1)
J
=
f
2 ...
e, sin e) de
F(cos
o
where F(cos e, sin 6) is a real rational function of cos e and sin e [for example,
(sin2 e)/(5 - 4 cos e)] and is finite (doe5. not become infinite) on the interval of integration.
Setting eill = z, we obtain
(z + +)
(2)
(z - +)
Since F is rational in cos e and sin e, Eq. (2) shows that F is now a rational function of
;;;, say, f(.:). Since d;;;lde = ieill, we have de = cl.:/i;:. and the given integral takes the form
= J. f(::) ~.:
Jc
IZ
J
(3)
and, as e ranges from 0 to 27T in (I), the vaIiable z = eil! ranges counterclockwise once
around the unit circle Izi = 1. (Review Sec. 13.5 if necessary.)
E X AMP L ElAn Integral of the Type (1)
Show b} the pre,ent method that
Solution.
J,-271"
o
We use cos fJ = ~(:
de
V2 -
cos
+ 1/:) and de
=
e = 17T.
= d:/i::..
f
Then the integral becomes
d::.
i
c __
(_2- 7V2 -+
2
..
I)
~
= -
2 J.
i Jc
d-
(::. - V2 -
1)(;: -
V2 +
1) .
We see that the integrand has a simple pole at ::'1 = V2 + I outside the unit circle C. so that it is of no interest
here. and another simple pole at::2 = '\ '2 - I (where::. - V2 + I = 0) inside C with residue [by (3), Sec. 16.3]
z~~~ (~- V2 - I):Z - V2 + I)
= [ Z -
~
-
I
1~V'2-1
2
Answer: 27Ti( -2/i)( -1/2) = 27T. (Here -21i is the factor in front of the last integral.)
•
SEC. 16.4
719
Residue Integration of Real Integrals
As anOlher large class, let us consider real integrals of the form
IX f(x) dx:.
(4)
-x
Such an integral, whose interval of integration is not finite is called an improper integral,
and i( has the meaning
I
(5')
X
-x
f(x) dx
= lim
a_-:c
I
0
f(x) dx:
a
+ b-----')ox
lim
f
b
f(x) dt.
0
If both limits exist, we may couple (he (wo independent passages
I
(5)
DO
f(x) dx = lim
R-----')occ
-00
I
and
(0 - 0 0
x.
and write
R
f(x) dx.
-R
The limit in (5) is called the Cauchy principal value of the integral. It is written
I=
pr. v.
f(x) dx.
-x
It may exist even if the limits in (5') do not. EXllmple:
lim
R~x
I
R
x dol = lim
R~x
-R
(R2
2
R2)
- -
= 0,
2
but
lim
b_x
f
b
0
xdx
=
x.
We assume that the function f(t) in (4) is a real rational function whose denominator
is different from zero for all real x and is of degree at least (Wo units higher than the
degree of (he numerator. Then the limits in (5') exist. and we may start from (5). We
consider the corresponding contour integral
f
(5*)
fez) d::.
c
around a path C in Fig. 371. Since .f(x) is rational, fez) has finitely many poles in the
upper half-plane, and if we choose R large enough, then C encloses ali these poles. By
the residue theorem we then obtain
fc
f(:o d::.
f
= f(::.) d::. +
s
I
R
f(x) dx
= 27Ti
2: Res f(::.)
-R
Yj
_L,T\
-R
Fig. 371.
I
R
x]
Path C of the contour integral in (5*)
720
CHAP. 16
Laurent Series. Residue Integration
where the sum consists of all the residues of f(z) at the points in the upper half-plane at
which f(z) has a pole. From this we have
I
(6)
R
f(x) dx
-R
= 27Ti ~ Res f(z)
-
I
f(z) dz.
S
We prove that, jf R --') x, the value of the integral over the semicircle S approaches
zero. If we set.: = Rei/!, then S is represented by R = const, and as z ranges along S. the
variable ranges from 0 to 7T. Since. by assumption, the degree of the denominator of
f(z) is at least two units higher than the degree of the numerator, we have
e
k
(Izl = R >
If(z)1 < Izl2
Ro)
for sufficiently large constants k and Ro. By the ML-inequality in Sec. 14.1,
II I
s
f(z) dz
k
< '2
R
TTR
k7T
=-
R
Hence, as R approaches infinity. the value of the integral over S approaches zero. and (5)
and (6) yield the result
{>O f(x) dx = 27Ti ~ Res f(z)
(7)
-00
where we sum over all the residues of f(::.) at the poles of f(z) in the upper half-plane.
E X AMP L E 2
An Improper Integral from 0 to
00
Using (7), show that
d-.:
oe
J
o
1+
x4
7T
=
2\12 .
y
x
Fig. 372.
Solutioll.
Indeed. fez) = 11(1 +
Zl =
e ....iJ4~
has four simple poles at the poims Imake a sketch)
Z4)
"7
Example 2
_
"'2 -
e
3wiJ4
,
The first two of these poles lie in the upper half-plane (Fig. 372). From (4) in the last section we find the residues
SEC. 16.4
721
Residue Integration of Real Integrals
.
[
Res fez)
=
Z~Zl
Res fez) = [
Z~Z2
1
(1
+
(1
+
4 ,
]
Z~Zl
z)
1
4' ]
z)
Z~Z2
[
=
1 ]
_ 1 -3"';/4 -_ - -1 e rri/4.
- - e
4z:
Z~Zl
4
4
~
[~J
=
4z
Z~Z2
=
e- 97Ti/ 4 =
..!..
4
..!..
4
e- wi/ 4 •
(Here we used e"'; = -I and e -2"'; = 1.) By (1) in Sec. 13.6 and (7) in this section,
f
cc
dr:
27Ti
'/4
'/4
27Ti
7T
7T
-ro 1 + x4 = - 4 (e= - e--m ) = - 4 '2i'sin '4 = 7Tsin '4 =
7T
V2 .
Since 1/(1 + x 4) is an even function, we thus obtain, as asserted,
•
7T
2V2 .
Fourier Integrals
The method of evaluating (4) by creating a closed contour (Fig. 371) and "blowing it up"
extends to integrals
(8)
fro f(x) cos sx dx
fro f(x) sin sx
and
tb
(s real)
-cc
-00
as they occur in connection with the Fourier integral (Sec. 11.7).
If f(x) is a rational function satisfying the assumption on the degree as for (4), we may
consider the corresponding integral
f
c
fez) e
isz
(s real and positive)
dz
over the contour C in Fig. 371 on p. 719. Instead of (7) we now get
f=-rof(x)e
(9)
isx
dx
= 27Ti ~
Res [f(z)e isZ ]
(s
>
0)
where we sum the residues of f(z)e isz at its poles in the upper half-plane. Equating the
real and the imaginary parts on both sides of (9), we have
foo f(x) cos sx dx =
-27T
~ 1m Res [f(z)eisZ ],
-00
(10)
(s> 0)
fro f(x) sin sx d>o:
-co
= 27T ~ Re Res [f(z)e isZ ].
To establish (9), we must show [as for (4)] that the value of the integral over the
semicircle S in Fig. 371 approaches 0 as R -7 00. Now s > 0 and S lies in the upper
half-plane y ~ O. Hence
(s> 0,
y
~
0).
From this we obtain the inequality /f(z)e isz / = /f(z)//e isz / ~ /f(z)/ (s > 0, y ~ 0). This
reduces our present problem to that for (4). Continuing as before gives (9) and (10). •
722
E X AMP L E 3
CHAP.16
Laurent Series. Residue Integration
An Application of (10)
I""
Show that
-x
cos SX
-2 - 2
k
+x
d~=
7T
e
-
I
-ks
k
cc
SIn.1X
- 2 - - 2 d~=O
-0:;
k +x
(S
>
O. k
In fact, e isz/(k 2 + ;:2) has only one pole in the upper half·plane, namely. a simple pole at::
and from (4) in Sec. 16.3 we obtain
Solution.
e
isz
Res - - z~ik k 2 + Z2
[
=
e
isz
]
-
z~ik
2::
>
0).
=
ik.
e -ks
= -.- .
2fk
Thus
I
isx
:x:
-cc
Since eisx
=
-ks
e . e
J... 2 + x 2 d-.: = 27Tf 2ik
-ks
7T
=
k
e
.
cos sx + ; sin nc. this yield_ the above results lsee also (5) in Sec. 11.7.]
•
Another Kind of Improper Integral
We consider an improper integral
I dx
B
(11)
f(x)
A
whose mtegrand becomes infinite at a point a in the interval of integration.
=
lim If(x)1
x __ a
00.
By definition. this integral (11) means
B
(12)
I f(x) d"l:
B
Q-E
=
A
lim
E_O
I f(x) dx + lim I f(x) dx
~-O
A
a+~
where both E and TJ approach zero independently and through positive values. It may happen
that neither of these two limits exists if E and TJ go to 0 independently, but the limit
lim
(13)
E_O
[IO-;(x) dx + IB f(x) dX]
A
a+E
exists. This is called the Cauchy principal value of the integral. It is written
pro
V.
I
B
f(x) dL
A
For example,
pI.
V.
II {~~
-1 .t
=
E
lim
[I-
E_O
-1
dr +
x3
II xdx ]
E
3
= 0:
the principal value exists, although the integral itself has no meaning.
In the case of simple poles on the real axis we shall obtain a formula for the principal
value of an integral from -00 to 00. This formula will result from the following theorem.
SEC. 16.4
723
Residue Integration of Real Integrals
THEOREM 1
Simple Poles on the Real Axis
If f(:::;) has a simple poLe at z = a on the real axis, then (Fig. 373)
lim
7---+0
I
C
= 7ri Res fez).
fez) dz
2=a
2
0,
a-r
PROOF
a+r
a
Fig. 373.
x
Theorem 1
By the definition of a simple pole (Sec. 16.2) the integrand fez) has for 0 <
the Laurent series
f(:::;) =
:::;-a
+ gC:),
bi
Iz - al <
R
= Res fez).
2=a
Here g(z) is analytic on the semicircle of integration (Fig. 373)
and for all z between C 2 and the x-axis, and thus bounded on C2 , say, Ig(z) I ~ M. By
integration.
I
C2
f(:::;) d:::;
= f7i"
b!e ireiB dB
ore
+
I
g(;;::) d:::;
=
b I 7ri
C2
+
I
g(:::;) dz.
C2
The second integral on the right cannot exceed M7rr in absolute value. by the ML-inequality
(Sec. 14.1). and ML = M7rr~ 0 as r~ o.
•
Figure 374 shows the idea of applying Theorem l to obtain the principal value of the
integral of a rational function f(x) from -:lJ to:xl. For sufficiently large R the integral over
the entire contour in Fig. 374 has the value J given by 27ri times the sum of the residues
of f(:::;) at the singularities in the upper half-plane. We assume that f(x) satisfies the degree
Fig. 374.
Application of Theorem 1
724
CHAP. 16
laurent Series. Residue Integration
condition imposed in connection with (4). Then the value of the integral over the large
semicircle S approaches 0 as R ~ x. For r ~ 0 the integral over C2 (clockwise!)
approaches the value
K
=
7Ti Res f(z)
-
2=a
by Theorem I. Together this shows that the principal value P of the integral from -00 to
00 plus K equals J; hence P = J K = J + 7Ti Resz~a f(z). [f f(z) has several simple
poles on the real axis, then K will be -7Ti times the sum of the corresponding residues.
Hence the desired formula is
pro
(14)
V.
I
oc
f(x) dx = 27Ti
-oc
L
+
Res f(z)
7Ti
L
Res f(z)
where the first sum extends over all poles in the upper half-plane and the second over all
poles on the real axis, the latter being simple by assumption.
E X AMP L E 4
Poles on the Real Axis
Find the principal value
d,
x
pr.
Solutioll.
V.
I_=
+
(x2 _ 30t
2)(x2
+
I)
Since
x 2 - 3x
the integrand f(x), considered for complex
z=
~,
+ 2
= (x -
has simple poles at
Res f(:d =
I,
I )(x - 2),
z~l
1
[
2)(~2 + I)
(z -
]
z~l
2 '
z=
2.
Res f(::) = [
z~2
12
1)(::
(:: -
]
+ I)
z~2
I
5 '
:: =
Resj(::) =
i,
z~i
[2
(::
-
3.;:
1
+
2)(;:
+ i)
]
z~i
3 - i
=6+2;=
20'
and ar .: = -; in the lower half-plane, which is of no interest here. From (14) we get the answer
pro
V.
I
d,
x
-x
(x2 _ 3x
+
2)(x 2
(
+ I)
= 2wi
3- i) + wi (1
w-"2 +
I)
5"
W
=
10
•
More integrals of the kind considered in this section are included in the problem set. Try
also your CAS, which may sometimes give you false results on complex integrals.
SEC. 16.4
.... -.-..
1 ...1.1 .......
__ ...
11-81
725
Residue Integration of Real Integrals
_-=_
. _.......
INTEGRALS INVOLVING COSINE AND SINE
Evaluate the following integrals. (Show the details of your
work.)
1.
3.
r~
o
t~
o
r
dO
dO
0
7.
f
t"
o
dO
o 2 + cos 0
t
23.
I
oo
dO
8 - 2 sin 0
25.
I
6.
(!
t~
o
sin 2 0
5 - 4 cos
(!
x+2
dx
-3--
-cc
w
0
dO
5 - 4 sin
oo
x + x
24.
I
2
:>0
_:>0
x+5
-!--dx
x - I
dx
-3--
x - x
dO
26.
cos fJ
-----dO.
13 - 12 cos 20
Him. cos 20 =
8.
IW
IMPROPER INTEGRALS:
POLES ON THE REAL AXIS
Find the Cauchy principal value (showing details):
-00
27<
o
-'.
37 - 12 cos 0
w
5.
2.
7 + 6 cos 0
123-271
27.
CC
dx
-00
-x-;;4-+-3-x"""2---4
I
L:
2
+4cosO
dO
17 - 8 cos (!
28. TEAM PROJECT. Comments on Real Integrals.
(a) Formula nO) follows from (9). Give the details.
2
(b) Use of auxiliary results. Integrating e- z around
IMPROPER INTEGRALS:
INFINITE INTERVAL OF INTEGRATION
the boundary C of the rectangle with vertices -a, a,
+ ib, -a + ib, letting a --> co, and using
a
Evaluate (showing the details):
9.
tl.
L:
I:
x
13.
15.
I
""
I
-00
dx
x6
18.
I
I
I
dx
-c""C
O--+-)""""2
x2
4
co
-x
14.
X3
---8
x2
x
+x
+
+
dx
I
-4--
x
+
16.
I
+
x
(x 2
I
dx
2x
-
+ 5)2
L: -x-'-4-~-x-I-6
L: -(X-;2;-+-1-~-;X"""2-+-9-)
dx
I
cosx
=
-00
22.
I
-4--
-x
20.
+
12.
L: -(X-,.2;:----;-~-+-2-,)2;:x
t' ~dx
L:
-x
ell:
-"" 1
17.
10.
x2 +
dx
19.
I
I
""
_::>0
dx
cos 4x
-x"O;"4-+-5-'x2::--+-4 dx
21.
.
smx
-4--
co
-00
X
+
1
sin 3x
-4--
x
+ 1
dx
dx
show that
""
L
2
-v:,;
e- X cos 2bx dx = - - e
o
2
_b 2
.
(This integral is needed in heat conduction in Sec.
12.6.)
(c) Inspection. Solve Probs. 15 and 21 without
calculation.
29. CAS EXPERIMENT. Check your CAS. Find out to
what extent your CAS can evaluate integrals of the
form (1), (4). and (8) correctly. Do this by comparing
the results of direct integration (which may come out
false) with those of using residues.
30. CAS EXPERIMENT. Simple Poles on the Real
Axis. Experiment with integrals f~co f(x) dx.
f(x) = [(x - al)(x - a2) ... (x - ak)r 1. aj real and
all different, k > L Conjecture that the principal value
of these integrals is O. Try to prove this for a special
k, say, k = 3. For general k.
726
CHAP. 16
Laurent Series. Residue Integration
= .1
1. Laurent series generalize Taylor senes. Explain the
details.
S T ION SAN 0
21.
c~sh 5z
z +
k-
, C:
il = 2
4
2. Can a function have several Laurent series with the same
center? Explain. If your answer is yes, give examples.
22.
3. What is the principal part of a Laurent series? Its
significance?
23. cot 8;:., C: 1:.::1 = 0.2
4. What is a pole? An essential singularity,? Give
examples.
5. What is Picard's theorem? Why did it occur in this
chapter?
4z
3
+ 7z
cos Z
z'n
,11
k+
, C:
cos ;:.
,:2 sin z
24. 4_2 _ 1 ,C:
25.
PRO B L EMS
Iz -
11 = 1
11 = 2
I .,.,
2 ... ., C-.....
H
=
6. What is the Riemann sphere? The extended complex
plane? Tts significance?
26.
7. Is e lk2 analytic or singular at infinity? cosh;:.? (;:. - 4)3?
Explain.
27.
15z +9
, C:
Z3 - 9z
Iz - 31
28.
15;:. +9
, C:
_3 _
9z
Izi
Z2
8. What is the residue? Why is it important?
9. State formulas for residues from memory.
10. State some further methods for calculating residues.
+
;;.2 _
1
1 2
+ v2 =
C: -x
2
.
2;:.
= 2
=4
11. What is residue integration? To what kind of complex
integrals does it apply?
129- 35 1
12. By what idea can we apply residue integration to real
integrals from -x to x,? Give simple examples.
Evaluate by the methods of this chapter (showing the
details):
13. What is a zero of an analytic function? How are zeros
classified?
29.
15. Can the residue at a singular point be O? At a simple
pole'?
16. What is a meromorphic function? An entire function?
Give examples.
COMPLEX INTEGRALS
117-281
30.
17.
31.
32.
z
c: Izl
18. ~,
+
i ' C:
iz+
Z2 _
i;:.
34.
c: 1::1
IOz
20.
[:
L
k-
+2 '
C:
2il = 3
/z -
1/ = 3
35.
f~
e
de
k + cos e , k >
1
de
1 - ~ sin
sin
e
e
3
+ cos e
(l
+ x 2 )2
x
de
dx
dx
::>0
sin 2;:.
19. 2;:.
25 - 24 cos
{7T
0
= 1
de
{7T
0
33.
-4 '
f7T
0
Integrate counterclockwise around C. (Show the details.)
tan ;:.
[7T
0
14. What are improper integrals? Cauchy principal values?
Give examples.
REAL INTEGRALS
(1
+
2
X )2
+ 2X2
+ 4X4
dr
36. Obtain the answer to Prob. 18 in Sec. 16.4 from the
present Prob. 35.
727
Summary of Chapter 16
Laurent Series. Residue Integration
A Laurent series is a series of the form
(Sec. 16.1)
(I)
or, more briefly written [but this means the same as (1)!]
I
f(:;;*)
on = - - ,(
n+l dz*
27Ti Jc (z* - Zo)
(1 *)
n=-oo
where n = 0, ± I, ±2, .... This series converges in an open annulus (ring) A with
center Zoo In A the function fez) is analytic. At points not in A it may have
singularities. The first series in (1) is a power series. In a given annulus, a Laurent
~eries of fez) is unique. but fez) may have different Laurent series in different annuli
with the same center.
Of particular importance is the Laurent series (I) that converges in a neighborhood
of Zo except at:;;o itself, say, for 0 < Iz - ':01 < R (R > 0, suitable). The series (or
finite sum) of the negative powers in this Laurent series is called the principal part
of fez) at Zo. The coefficient hI of 1/(:;; - zo) in this series is called the residue of
f(::) at Zo and is given by [see (1) and (1 *)]
(2) hI = Res fez)
Z~Zo
=
f
_1_.
f(z*) dz*.
27T1 C
Thus
,( f(z"') d.:* = 27Ti Res fez).
~
Z=~
hI can be used for integration as shown in (2) because it can be found from
~~~ fez) =
(3)
(m
~
I)!
!~~o (~:~11 [(z -
zo)'''f(:;;)]),
(Sec. 16.3),
provided f(z.) has at Z.O a pole of order m; by definition this means that that principal
part has 1/(z - zo)'n as its highest negative power. Thus for a simple pole (111 = 1),
Res fez) = lim (z - ':o)f(z);
Z=Zo
Z~Zo
also,
p(.:)
Res - -
2~2U
q(.:)
p(zo)
= -,--.
q (zo)
If the principal part is an infinite series, the singularity of fez) at Zo is called an
essential singularity (Sec. 16.2).
Section 16.2 also discusses the extended complex plane, that is, the complex plane
with an improper point x (,'infinity") attached.
Residue integration may also be used to evaluate certain classes of complicated
real integrals (Sec. 16.4).
CHAPTER
17
Conformal Mapping
If a complex function w = f(~) is defined in a domain D of the ~-plane, then to each point
in D there corresponds a point in the lI'-plane. In this way we obtain a mapping of D onto
the range of values of .f(::;) in the w-plane. We shall see that if f(::;) is an analytic function,
then the mapping given by tv = fez) is conformal (angle-preserving), except at points
where the derivative (z) is zero.
t'
Conformality appeared early in history in connection with constructing maps of the
globe, which can be conformal (can give directions correctly) or "equiareal" (give areas
correctly, except for a scale factor). but cannot have both properties, as can be proved
(see [GR8] in App. 1).
Conformality is the most important geometric property of analytic functions and gives
the possibility of a geometric approach to complex analysis. Indeed, just as in calculus
we use curves of real functions y = f(x) for studying "geometric" propelties of functions,
in complex analysis we can use conformal mappings for obtaining a deeper understanding
of properties of functions, notably of those discussed in Chap. 13.
Indeed. we shall first define the concepts of conformal mapping and then consider
mappings by those elementary analytic functions in Chap. 13.
This is one purpose of this chapter. A second purpose, more important to the engineer
and physicist, is the use of conformal mapping in connection with potential problems. In
fact, in this chapter and in the next one we shall see that conformal mapping yields a
standard method for solving boundary value problems in (two-dimensional) potential
theory by transforming a complicated region into a simpler one. Corresponding
applications will concern problems from electrostatics, heat flow. and fluid flow.
In the last section (17.5) we explain the concept of a Riemann surface, which fits well
into the present discussion of "geometric" ideas.
Prerequisite: Chap. 13.
Sections that may be omitted ill a shorter course: 17.3 and 17.5
References and Answers to Problems: App. I Part D, App. 2.
728
729
SEC. 17.1
Geometry of Analytic Functions: Conformal Mapping
17.1
Geometry of Analytic Functions:
Conformal Mapping
A complex function
(I)
w
= fez) = u(x. y) + iv(x. y)
(z
=
x
+
iy)
of a complex variable z gives a mapping of its domain of definition D in the complex
~-plane illto the complex w-plane or Ol1to its range of values in that plane. l For any point
Zo in D the point Wo = f(zo) is called the image of z{) with respect to f. More generally,
for the points of a curve C in D the image points form the image of C; similarly for other
point sets in D. Also, instead of the mapping by a fUllction w = f(z) we shall say more
briefly the mappillg w = fez).
EXAMPLE 1
Mapping w = I(z) =
Z2
Using polar forms z ~ ,.eiH and w ~ Rei</>, we have w ~ ;:2 = ,.2e2ifl. Comparing moduli and arguments
gives R = ,.2 and cf> = 20. Hence circles r = "0 are mapped onto circles R = "02 and rays 0 = 00 onto rays
cf> = 200 , Figure 375 shows this for the region I ~ Izl ~ 3/2. rr/6 ~ 0 ~ TT/3. which is mapped onto the region
I ~
Iwl ~ 9/4. TT/3
~ 0 ~ 2TT13.
In Cartesian coordinates we have :::
Hence vertical lines x =
obtain y2 = c 2 - u and
~
x
+
iy and
C = COilS! are mapped onto 1I =
2
2
= 4c y2. Together,
c2
-
y2, V = 2cy.
From this we can eliminate
y.
We
v
(Fig. 376).
~
These parabolas open to the left. Similarly. hmizuntallines y
to the right.
k
= COliS! are mapped onto parabolas opening
(Fig. 376). •
v
I
\
\
I
I
\
/
\
y
I
\
\
/
2
/
/
/
/
x
u
(z-plane)
Fig, 375.
Mapping w =
(w-plane)
Z2.
Lines
Izi = const, arg z = const and their images in the w-plane
IThe general terminology is as follows. A mapping of a set A into a set B is called surjective or a mapping
of A onto B if every element of B is the image of at least one element of A. It is called injective or one-to-one
ifdi~erent elements ot A have different images in B. Finally, it is called bijective if it is both sUljective and
mJeclIve.
CHAP. 17
730
Conformal Mapping
v
y=2
y=l
\
-5
/
/
x=2
Fig. 376.
Images of x
=
const, Y
=
const under w
= Z2
Conformal Mapping
A mapping w = fez) is called conformal if it pre~erves angles between oriented curves
in magnitude as well as in sense. Figure 377 shows what this means. The angle
a (0 ~ a ~ 7T) between two intersecting curves C 1 and C2 is defined to be the angle
between their oriented tangents at the intersection point z{). And conformalit), means that
the images C 1 * and C2 * of C1 and C2 make the same angle as the curves themselves in
both magnitude and direction.
THEOREM 1
Conformality of Mapping by Analytic Functions
The mapping w = f(:;:') by an a1lalytic jimctioll f is confonnal, except at critical
points, that is, poiTlts at which the derivative f I is zero.
PROOF
n· = :2 has a critical point at z = O. where
(see Fig. 375), so that conformality fails.
The idea of proof is to consider a curve
(2)
f' (z)
C: z(1) = x(t)
= 2z
= 0 and the angles are doubled
+ i)'(t)
in the domain of fez) and to show that w = .Hz) rotates all tangents at a point Zo (where
0) through the same angle. Now z(1) = dzldt = .i(1) + i .\i(t) is tangent to C in
(2) becau'ie this is the limit of (::1 - zo)/!.lt (which has the direction of the secant 21 - ::0
f' (zo) "*
-(z-plane)
(w-plane)
Fig. 377. Curves C1 and C2 and their respective images
ct and C2* under a conformal mapping w = [(z)
SEC. 17.1
731
Geometry of Analytic Functions: Conformal Mapping
in Fig. 378) as ZI approaches Zo along C. The image C* of C is w = f(z(1)). By the chain
rule, Ii· = t' (z(t»z(t). Hence the tangent direction of C* is given by the argument (use
(9) in Sec. 13.2)
(3)
w= arg t'
arg
+ arg z
z
where arg gives the tangent direction of C. This shows that the mapping rotates all
directions at a point Zo in the domain of analyticity of f through the same angle arg f' (:::0),
which exists as long as f' (zo)
O. But this means conformality, as Fig. 377 illustrates
for an angle a between two curves. whose images C1 * and C2 * make the same angle
•
(because of the rotation).
"*
I.
CurveC
Tangent
/
Fig. 378.
Secant and tangent of the curve C
In the remainder of this section and in the next ones we shall consider various conformal
mappings that are of practical interest, for instance, in modeling potential problems.
E X AMP L E 2
Conformality of w = zn
.
The mapping w = zn, n = 2,3, ... , is conformal, except at Z = 0, where ,/ = llZn-l = O. For n = 2 this is
shown in Fig. 375: we see that at 0 the angles are doubled. For general n the angles at 0 are multiplied by a
factor II under the mapping. Hence the sector 0 :;" e :;" 'Trln is mapped by ;::n onto the upper half·plane u ~ 0
~~.
x
Fig. 379.
E X AMP L E 3
Mapping w
u
Mapping by w
= zn
= z + 1/z. Joukowski Airfoil
In terms of polar coordinates this mapping is
tv = tl
1
-t iu = r(cos fI -t i sin fI) -t - (cos fI - i sin fI).
r
By separating the real and imaginary parts we thus obtain
l/
=
a cos
e,
u=bsinfl
*
where
a=r+
Hence circles Izl = r = const
I are mapped onto ellipses x 1a
onto the segment -2 :;" u :;" 2 of the u-axis. See Fig. 380.
2
2
r
r
+ ilb =
2
l. The circle r
=
1 is mapped
732
CHAP. 17
Conformal Mapping
v
y
u
Fig. 380.
Example 3
Now the derivative of w is
(::: +
1)(;:: -
1)
_2
which is 0 at Z = ± I. These are the points at which the mapping is not conformal. The two circles in Fig. 381
pas, through z = -I. The larger is mapped onto a Jo"kowski ui/.foi/. The dashed circle passes through both -I
and I and is mapped onto a curved segment.
Another interesting application of w = Z + liz lthe flow around a cylinder) will be considered in Sec. 18.4. •
y
,
J
' "\ C
/
II
I
--0--
x
2
Fig. 381.
E X AMP L E 4
Conformality of w = e
u
Joukowski airfoil
Z
From (10) in Sec. 13.5 xwe have lezi = eX and Arg::: = y. Hence e Z maps a vertical straight line x = Xo = COllst
onto the circle Iwl = e 0 and a horizontal straight line)" = )"0 = CO/1St onto the ray arg ". = )"0. The rectangle
in Fig. 382 i, mapped onto a region bounded by circles and rays as shown.
The fundamental region -71 < Arg;:: ~ 71 of eZ in the :::-plane is mapped bijectively and conformally onto
the entire w-plane without the origin w = 0 (because e Z = 0 for no :::). Figure 383 shows that the upper half
o < y ~ 71 of the fundamental region is mapped onto the upper half-plane 0 < arg w ~ 71. the left half being
mapped inside the unit disk Iwl ~ 1 and the right half outside (why"!).
•
J
Y~~~~l_
1 D_____ C
05 _____
. A
B
oo
1
X
Fig. 382.
-3
Mapping by w = e Z
y
1t
o
-1
(z-planeJ
0
(w-plane)
Fig. 383.
Mapping by w = e Z
u
SEC. 17.1
733
Geometry of Analytic Functions: Conformal Mapping
E X AMP L E 5
Principle of Inverse Mapping. Mapping w
= Ln z
= f-\w) of w = f(::) is obtained by illterchlillging the roles of the
z-p/ane and the w-p/ane in the mapping by II' = 1(:;;).
Now the principal value w = f(::) = Ln :;; of the natural logarithm has the inverse z = [-1(11") = eW • From
Example -=I (with the notations::: aud lI" interchanged!) we know that [-\w) = e W maps the fundamental region
of the exponential function onto the :;;-plane without:;; = 0 (becau~e eW 0 for every w). Hence II" = fl:;;) = Ln :;;
maps the ::-plane without the origin and cut along the negative real axis (where (j = 1m Ln::: jumps by 21T)
Principle. The mapping by the inverse z
*
confonnally onto the horizontal strip -1T < V :§ 1T of the II"-plane. where w = II + iv.
Since the mapping If = Ln::: + 21Ti differs from w = Ln z by the translation 21Ti (veI1icalIy upward). this
function maps the z-plane (cut as before and 0 omitted) onto the strip 1T < V :§ 31T. Similarly for each of the
infinitely many mappings II' = In:;; = Ln:: :':: 21l1Ti (11 = O. I. 2 .... ). The corresponding horizontal strips
of width 21T (images of the :;;-plane under these mappings) together cover the whole w-plane without
overlapping.
•
Magnification Ratio.
By the definition of the derivative we have
lim If(Z) - .f(zo)
(4)
Z -
Z~Zu
1
=
If' (20)1·
Zo
Therefore, the mapping w = f(:::.) magnifies (or shortens) the lengths of short lines by
approximately the factor
(zo)l. The image of a small figure conforms to the original
figure in the sense that it has approximately the same shape. However, since (z) varies
from point to point, a large figure may have an image whose shape is quite different from
that of the original figure.
More on the Condition f'(z) -=I=- O. From (4) in Sec. 13.4 and the Cauchy-Riemann
equations we obtain
If'
(5')
f'
au + i au
- 12
If ,(z) 12 = 1-.
ax
ax
( ~1l)2
a.l
+ (~U)2
all
au
ax
iJy
au
au
ax
ay
ax
all
au
au au
a.l
ay
ay
ax
that is,
(5)
If' (z)1 2
a(lI, u)
---
=
a(x, y)
This determinant is the so-called Jacobian (Sec. 10.3) of the transformation w = f(z)
written in real form u = u(x, y), u = u(x, y). Hence (zo) =1= 0 implies that the Jacobian
is not 0 at ::0' This condition is sufficient that the mapping w = f(z) in a sufficiently small
neighborhood of:::.o is one-to-one or injective (different points have different images). See
Ref. [GR4] in App. 1.
f'
==== -... =
~.
=".-.--==-
1. Verify all calculations in Example I.
14-6/
2. Why do the images of the curves /;;:1 = COIlsl and
arg :: = COllst under a mapping by an analytic function
f(:) intersect at right angles, except at points at which
f'(:) = O?
Find and sketch or graph the image of the given curves
under the given mapping.
4. x = I. 2. 3, 4, y = I, 2, 3, 4; w = :2
3. Doe, the mapping w = Z = x - iy preserve angles in
size as well as in sense?
MAPPING OF CURVES
5. Curves as in Prob. 4, w = iz (Rotation)
6.
Izl
=
1/3.112. 1,2,3; Arg z = 0, ::'::17/4, ::'::1712, ::'::31712,
::'::17; w = liz
CHAP. 17
734
17-151
Conformal Mapping
11?,-23I
FAILURE OF CONFORMALITY
Find all points at which the following mappings are not
conformal.
17. ;:(Z4 - 5)
18. ~2 + 1/;:2
MAPPING OF REGIONS
Find and sketch or graph the image of the given region
under the given mapping.
7. -7T/4 < Arg z < 7T/4.
8. x ~ 1, \I' = liz
Izl
< 1/2.
w =
Z3
19. cos
9. Izl > I, W = 3~
10. 1m ~ > O. II" = I
11. x ~ 0, y ~ 0, 1;:1 ~ 4; w = Z2
12. -1 ~x~ 1,-7T<Y<7T:H'=ez
13. In 3 < x < In 5, lI" = e'
14. -7T < Y ~ 37T. II" = e Z
15. 2 ~ 1:::1 ~ 3, 7T/4 ~ e ~ 7T/2; w = Ln ~
21. ;:2
23.
=
Z4, (b) .Hz)
=
a)3. (.;::3 -
22.
exp
(Z5 -
80;:)
a)2
MAGNIFICATION RATIO, JACOBIAN
Find the magnification ratio M. Describe what it tell, you
about the mapping. Where is M equal to I? Find the
Jacobian J.
24. w = !Z2
25. w= e Z
26.
=
11;::, (c) f(z)
Il"
27. w= Lnz
= Z3
l/~
28. 11'=
29. Magnification of Angles. Let f(:::) be analytic at ;:0'
Suppose that .f' (zo) = O•... , tlc-IJ(~o) = O. Then
the mapping lI" = f(z) magnifies angles with vertex at
~o by a factor k. Illustrate this with examples for
k = 2, 3, 4.
11:::2,
(d) f(~) = (~ + i)/(l + d. Why do these curves
generally intersect at right angles? In your work.
experiment to get the best possible graphs. Also do the
same for other functions of your own choice. Observe
and record shortcomings of your CAS and means to
overcome such deficiencies.
17.2
(z -
124-281
16. CAS EXPERI:\IENT. Orthogonal Nets. Graph the
orthogonal net of the two families of level curves
Re f(z) = COIISt and 1m f(~) = comt, where
(a) f(z)
20. cosh 2:::
7T:::
+ a;: + b
30. Prove the statement in Prob. 29 for general k = I.
2, .... Hint. Use the Taylor series.
Linear Fractional Transformations
Conformal mappings can help in modeling and solving boundary value problems by first
mapping regions confommlly onto another. We shall explain this for standard regions
(disks. half-planes, strips) in the next section. For this it is useful to know properties of
special basic mappings. Accordingly, let us begin with the following very important class.
Linear fractional transformations (or Mobius transformations) are mappings
w=
(1)
a::.
cz
+b
+d
(ad - be
=1=
0)
where a, b, c, d are complex or real numbers. Differentiation gives
a(c::.
I
(2)
+ d)
H"
(c::.
- c(az
+
+ b)
ad - be
(c;::
d)2
+
d)2
This motivates our requirement ad - be =1= O. It implies conformality for all z and excludes
the totally uninteresting case w' == 0 once and for all. Special cases of (I) are
tr=::.+b
w
(3)
= a:::
w = a:::
w
=
liz
with
+b
(Tmns/atio11S )
lal =
1
(Rotations)
(Linear tmmjorl1latiolls)
(Inversion in the IInit circle).
SEC 17.2
735
Linear Fractional Transformations
E X AMP L E 1
= l/z (Fig. 384)
Properties of the Inversion w
In polar forms
z
= rew and w = Rei" the inversion
.~
Rel~
I
I.
e-'w
= -.- = retE}
11'
= liz. is
R=
and gives
r
,.
<1>=
-e.
Hence the unit circle k:1 = r = I is mapped onto the unit circle Iwl = R = I; II" = i<b = e -ill. For a general ~.
the image w = 1/;: can be found geometrically by marking Iwl = R = IIr on the segment from 0 to ;: and then
reflecting the mark in the real axis. (MaJ-.e a sketch.)
Figure 384 shows that w = 1/z maps horizontal and vertical straight lines onto circles or straight lines. Even
the following is true.
n' = 1/;:
maps el'en' stmight lille or circle Dllto {/ circle Dr straight lille.
v
y
I
I
I
L ---L...l
.--T
I
I
I
1
+--+--1---.1-----1
-2
L
-1
I
I
1
I
I
x
2
---+-+ +-I-+_---"I
~-l
I
I
I
I
Fig. 384.
Mapping (Inversion) w
= l/z
Proof. Every straight line or circle in the ;:-plane can be written
(A. B C, [) real).
A = 0 gives a straight line and A
'* 0 a circle. In terms of z and:': this equation becomes
-+"
A-::+B~+ C~+D=O.
--
Now
lI'
=
2
2i
11;:. Substitution of:: = IIII' and multiplication by II'W gives the equation
w+w
w-w
A + B - - - + C - - - + [)ww
2
2i
=
0
or. in terms of II and v.
A +
This represents a circle (if D
BII -
Cv +
DC/(2
'* 0) or a straight line (if D
=
+ v2 )
=
O.
0) in the w-plane.
•
The proof in this example suggests the use of z and Z instead of x and y. agelleral prillciple
that is often quite useful in practice.
Surprisingly, evn)' linear fractional transformation has the properly just proved:
THEOREM 1
Circles and Straight Lines
Every linear fractional trani/ormation (I) maps the totality of circles and STraighT
lines in the z-plane onto the totalitv of circles lind straight lines in the w-plane,
736
CHAP. 17
PROOF
Conformal Mapping
This is trivial for a translation or rotation. fairly obvious for a uniform expansion or
contraction. and true for w = 1/z, as just proved. Hence it also holds for composites of
these special mappings. Now comes the key idea of the proof: represent (1) in terms of
these special mappings. When e = 0, this is easy. When e =1= 0, the representation is
1
w=K--cz + d
+
a
ad - be
K= - - - e
where
e
This can be verified by substituting K. taking the common denominator and simplifying:
this yields (1). We can now set
WI
= ez,
and see from the previous formula that then 11' = lV4 + ale. This tells us that (1) is indeed
a composite of those special mappings and completes the proof.
•
Extended Complex Plane
The extended complex plane (the complex plane together with the point IX in Sec. 16.2)
can now be motivated even more naturally by linear fractional transformations as follows.
To each z for which ez + d =1= 0 there conesponds a unique w in (I). Now let e =1= O.
Then for z = -dIe we have ez + d = 0, so that no w conesponds to this ;:. This suggests
that we let IV = ex.; be the image of z = -dIe.
Also, the inverse mapping of (1) is obtained by solving (I) for z; this gives again a
linear fractional transformation
dw - b
z=
(4)
-ew
+a
When e =1= 0, then en' - a = 0 for w = ale, and we let ale be the image of;: = 00. With
these settings, the linear fractional transformation (1) is now a one-to-one mapping of the
extended z-plane onto the extended w-plane. We also ~ay that every linear fractional
transformation maps "the extended complex plane in a one-to-one manner onto itself."
Our discussion suggests the following.
General Remark. If z = 00. then the right side of (1) becomes the meaningless expression
(a· oo + b)/(c· 00 + d). We assign to it the value w = ale if e =1= 0 and w = 00 if e = O.
Fixed Points
Fixed points of a mapping w = .f(z) are points that are mapped onto themselves, are "kept
fixed" under the mapping. Thus they are obtained from
w
= .f(z) = z.
The identity mapping w = z has every point as a fixed point. The mapping w = Z has
infinitely many fixed points, w = liz has two, a rotation has one. and a translation none
in the finite plane. (Find them in each case.) For (1), the fixed-point condition w = z is
(5)
z=
az + b
ez
+d
'
thus
ez 2
-
(a - d)z - b
= O.
SEC. 17.3
Special Linear Fractional Transformations
737
This is a quadratic equation in z whose coefficients all vanish if and only if the mapping
is the identity mapping w = z (in this case, a = d oF 0, b = c = 0). Hence we have
Fixed Points
THEOREM 2
fr
A linear fractional transformation, not the identity, has at most two fixed points.
a linear fi'actional transformation is known to have three or more fixed points, it
must be the identity mapping w = z.
To make our present general discussion of linear fractional transfonnations even more
useful from a practical point of view, we extend it by further facts and typical examples,
in the problem set as well as in the next section.
-..-..............
. ...... - ........
,
1. Verify the calculations in the proof of Theorem 1.
18-141
2. (Composition ofLFTs) Show that substituting a linear
fractional transformation (LFf) into a LFf gives a
LFT.
Find the fixed points.
3. (Matrices) If you are familiar with 2 X 2 matrices,
prove that the coefficient matrices of (1) and (4) are
inverses of each other, provided ad - be = I, and
that the composition of LFfs corresponds to the
multiplication of the coefficient matrices.
Find the inverse;: = ;:(w). Check the result by solving ;:(w)
for II".
4;:
1\'
=
H' -
+i
+
-3;;:
7
6.
= 81z
10. w =
5
9. w = (4
z + 4i
z - I
12. w = - z+ 1
+ i)z
11. w = (z - i)2
13. w=
2iz - 1
z+
2i
+2
14. w = - - z- 1
3z
INVERSE
14-71
4.
8. w
FIXED POINTS
+
Z -
17.3
i
;
15. Find a LFT whose (only) fixed points are -2 and 2.
16. Find a LFT (not w = z) with fixed points 0 and l.
17. Find all LFfs with fixed points -I and 1.
3;:
5.
IV
=
7.
II'
= ---
18. Find all LFfs whose only fixed point is O.
2-
-
2;:
+ 5;
19. Find all LFfs with fixed points 0 and
4z
20. Find all LFfs without fixed points in the finite plane.
i
00.
Special Linear Fractional Transformations
In this section we shall see how to determine linear fractional transformations
(I)
w=
az +
eZ
b
+d
(ad - be
*- 0)
for mapping certain standard domains onto others and how to discuss properties of (I).
A mapping (1) is determined by a. b, c, d. actually by the ratios of three of these
constants to the fourth because we can drop or introduce a common factor. This makes it
plausible that three conditions determine a unique mapping (I):
738
CHAP. 17
THE 0 REM 1
Conformal Mapping
Three Points and Their Images Given
Three given distinct poil1ts :1' Z2, :::3 can always be mapped onto three prescribed
distinct poillts WI> W2, W3 by one, and only Olle, linear fractional transforlllation
W = .f(z). This mapping is given implicitly by the equation
(2)
tV -
WI
H'2 -
H'3
W -
W3
W2 -
WI
(rt" one of these poillts is the point
this point must be replaced by l.)
PROOF
x,
the qllotiellt of the two differences cOl/taining
Equation (2) is of the form F(w) = G(:) with linear fractional F and G. Hence
w = F-\G(z» = .f(z) , where F- 1 is the inverse of F and is linear fractional (see (4) in
Sec. 17.2) and so is the composite F- 1(G(z» (by Prob. 21). that is. w = fez) is linear
fractional. Now if in (2) we set w = Il·l. W2, W3 on the left and :: = Zl. Z2, :3 on the right.
we see that
F(Wl)
= 0.
F(W2)
=
G(:l)
= 0,
G(:2)
= 1.
I,
From the first column, F(Wl) = G(':l), thus WI = F-l(G(Zl» = f(Zl)' Similarly, W2 = f(Z2),
W3 = f(::3)' This proves the existence of the desired linear fractional transfOimation.
To prove uniqueness, let w = g(z) be a linear fractional transformation, which also
maps Zj onto Wj' j = 1,2,3. Thus Wj = g(Zj)' Hence g-llw) = Zj. where Wj = f(zj)'
Together, g-lUlzj » = Zj' a mapping with the three fixed points Zl, :2, :3' By Theorem 2
in Sec. 17.2, this is the identity mapping, g -l<f(:Z» = z for all z. Thus fl:) = g(z) for all
z, the uniqueness.
The last statement of Theorem I follows from the General Remark in Sec. 17.2.
•
Mapping of Standard Domains by Theorem 1
Using Theorem 1. we can now find linear fractional transformations according to the
following
Principle.
Prescribe three boundary points ZI> Z2, Z3 of the domain D in the z-plane.
Choose their images WI> W2, tl'3 on the boundary of the image D* of D in the w-plane.
Obtain the mapping from (2). Make sure that D is mapped onto D*, not onto its
complement. In the latter case, interchange two w-points. (Why does this help?)
E X AMP L E 1
Mapping of a Half-Plane onto a Disk (Fig. 385)
Find the linear fractional transformation (1) that maps
11'3 = I. respectively.
Solutioll.
Z1
=
-1, Z2
= 0':3 =
I onto
From (2) we obtain
w-(-I)
-i-I
~-(-l)
11"-1
-;-(-1)
z-I
thus
z -;
w=---iz + 1 .
0-1
0-(-1)'
W1
=
-I. H'2
=
-i,
SEC. 17.3
Special Linear Fractional Transformations
739
u
,,-y=o
~ /X=~
x=o
I
Fig. 385.
Linear fractional transformation in Example 1
Let us show that we can determine the specific properties of such a mapping without much calculation. For
x we have \I" = (x - il/(-ir + I). thus Iwl = I. so that the x-axis maps onto the unit circle. Since;: = i
gives IV = O. the upper half-plane maps onto the interior of that circle and the lower half-plane onto the exterior.
;: = O. i. x go onto \I" = -i. O. i. so that the positive imaginmy axis maps onto the segment S: II = 0, -1 ~ v ~ 1.
The vertical lines r = COllst map onto circles (by Theorem 1, Sec. 17.2) through w = i (the image of;: = x)
and perpendicular to 111'1 = I (by conformality; see Fig. 385). Similarly. the horizontal lines y = cOllstmap onto
circles through lI' = i and perpendicular to S (by conforrnality). Figure 385 gives these circles for y ~ O. and
for y < 0 they lie outside the unit disk shown.
•
: =
E X AMP L E 2
Occurrence of
00
Determine the linear fractional tram'[ormation that maps
11'3 = 1. respectively.
Solution.
ZI
= 0,
Z2
= 1, <=3 =
JO
onto
11'1
=
-1, 11'2
=
-i,
From (2) we obtain the desired mapping
U"
=
;: - i
z+i
This is sometimes called the Cayley trallsformatioll. 2 In this case. (2) gave at first the quotient (I - x)/(:: - x),
which we had to replace by 1.
•
E X AMP L E 3
Mapping of a Disk onto a Half-Plane
Find the linear tractional transformation that maps ::1 = -1, ::2 = i. Z3 = I onto "'I = O. '1'2 = i, "'3
respectively. such that the unit disk is mapped onto the right half-plane. (Sketch disk and half-plane.)
Solution.
=
x,
From (2) we obtain. after replacing (i - x)/(\\' - x) by I
;: +
1
\1"=---
;: - 1
•
Mapping half-planes onto half-planes is another task of practical interest. For instance,
we may wish to map the upper half-plane y ~ 0 onto the upper half-plane v ~ O. Then
the x-axis must be mapped onto the u-axis.
2
ARTHUR CAYLEY (1821-1895). English mathematician and professor at Cambridge. is known for his
important work in algebra. matrix theory. and ditferential equations.
~ .
740
E X AMP L E 4
CHAP. 17
Conformal Mapping
Mapping of a Half-Plane onto a Half-Plane
Find the linear fractional transfonnation that maps
= 3/8, respectively.
ZI =
-2,
:2 =
0,
;:3 =
2 onto
WI
:x:, '1'2 =
1/4,
w3
Solution.
You may verify that (2) gives the mapping function
Z
+
1
w=-2z + 4
•
What is the image of the x-axis? Of the y-axis?
Mappings of disks onto disks is a third class of practical problems. We may readily
verify that the unit disk in the z-plane is mapped onto the unit disk in the w-plane by the
following function, which maps Zo onto the center w = O.
z - Zo
w= - - ez - 1 '
(3)
=
To see this, take Izl
I, obtaining, with e
e = zo,
IZol < I.
= Zo as in (3),
= Iz - el
Iz - zol
= Izllz - el
= Izz
-
c zl =
11 -
e zl
= Ie z - II.
Hence
Iwl
= Iz - Zol/lcz -
II
= I
from (3), so that Izl = 1 maps onto Iwl = 1, as claimed, with Zo going onto O. as the
numerator in (3) shows.
Formula (3) is illustrated by the following example. Another interesting case will be
given in Prob. 10 of Sec. 18.2.
E X AMP L E 5
Mapping of the Unit Disk onto the Unit Disk
Taking Zo
= ~ in (3), we obtain (verify!)
2z - 1
w=---
(Fig. 386).
z-2
/:
~
I
,
I
J1o----r-I
I
:
I
I
~
v
\
\
o~~
i
/
u
1 x
~\-,~:--..--+!-;;;~
Fig. 386.
Mapping in Example 5
•
SEC. 17.3
741
Special Linear Fractional Transformations
E X AMP L E 6
Mapping of an Angular Region onto the Unit Disk
Certain mapping problems can be solved by combining linear fractional transformations with others. For instance,
to map the angular region D: -71/6 ~ arg z ~ 71/6 (Fig. 387) onto the unit disk Iwl ~ I, we may map D by
Z = Z3 onto the right Z-half-plane and then the latter onto the disk 111'1 ~ I by
~3 _
Z- I
w=i Z + I '
I
•
w=i--~3 + 1
combined
,/
\
11:/6
(z-plane)
(Z-plane)
Fig. 387.
(w-plane)
Mapping in Example 6
This is the end of our discussion of linear fractional transformations. In the next section
we tum to conformal mappings by other analytic functions (sine, cosine, etc.).
1. Derive the mapping in Example 2 from (2).
2. (Inverse) Find the inverse of the mapping in Example
1. Show that under that inverse the lines x = COlist are
the images of circles in the w-plane with centers on the
line v = 1.
3. Verify the formula (3) for disks.
4. Derive the mapping in Example 4 from (2). Find its
inverse and prove by calculation that it has the same
fixed points as the mapping itself. Is this surprising?
5. (Inverse) If w = f(z) is any transformation that has an
inverse, prove the (trivial!) fact that f and its inverse
have the same fixed points.
6. CAS
EXPERIMENT.
Linear
Fractional
Transformations (LFTs). (a) Graph typical regions
(squares, disks, etc.) and their images under the LFTs in
Examples 1-5.
(b) Make an experimental study of the continuous
dependence of LFTs on their coefficients. For instance,
change the LFT in Example 4 continuously and graph
the changing image of a fixed region (applying
animation if available).
7. -1. 0, I onto -0.6 - 0.8i, -1, -0.6. + 0.8i
8. 0, 1,2 onto I,!,
!
+
9. 2i. -2;.4 onto -4
2i, -4 - 2i, 0
10. i, -I, I onto -1, -i. i
11. O. I,
<Xl
onto
00,
1,0
12. 0, -i, i onto -1. 0,
13. 2i, i, 0 onto ~i, 2i,
00
00
14. O. 2i. -2i onto - I, O.
00
15. -1,0, I onto 0, I, -I
16. Find all LFTs w(:.':) that map the x-axis onto the £I-axis.
17. Find a LFT that maps 1:.0:1 ~ I onto Iwl ~ 1 so that
z = i/2 is mapped onto w = O. Sketch the images of
the lines x = COlist and y = COllst.
18. Find an analytic function that maps the second quadrant
of the z-plane onto the interior of the unit circle in the
w-plane.
LFTs FROM THREE POINTS AND
THEIR IMAGES
19. Find an analytic function w = i(z) that maps the region
o ~ arg z ~ 71/4 onto the unit disk Iwl ~ 1.
Find the LFT that maps the given three points onto the three
given points in the respective order.
20. (Composite) Show that the composite of two LFrs is
a LFT.
/7-15/
742
17.4
CHAP. 17
Conformal Mapping
Conformal Mapping by Other Functions
So far we have discussed the mapping by zn, eZ (Sec. 17.1) and linear fractional
transformations (Secs. 17.2, 17.3), and we shall now tum to the mapping by trigonometric
and hyperbolic analytic functions.
v
y
r rl
-2"
f--~
f-f--
Rl
u
" x
2
.
I I
I
(z-plane)
(w-plane)
Fig. 388.
Sine Function.
(I)
Mapping w
=
u
+ iv = sin z
Figure 388 shows the mapping by
w
= u + iv = sin z = sin x cosh y + i cos x sinh y
(Sec. 13.6).
Hence
(2)
II
=
sin x cosh y,
v = cos x sinh y.
Since sin ~ is periodic with period 21T, the mapping is certainly not one-to-one if we
consider it in the full z-plane. We restrict:: to the vertical strip S: -~1T ~ X ~ ~1T in
Fig. 388. Since f' (z) = cos z = 0 at z = ±!1T, the mapping is not conformal at these two
critical points. We claim that the rectangular net of straight lines x = const and y = const
in Fig. 388 is mapped onto a net in the w-plane consisting of hyperbolas (the images of
the vertical lines x = const) and ellipses (the images of the horizontal lines y = const)
intersecting the hyperbolas at right angles (confOimality!). Corresponding calculations are
simple. From (2) and the relations sin 2 x + cos 2 X = I and cosh2 y - sinh2 y = I we obtain
(Hyperbolas)
(Ellipses).
Exceptions are the vertical lines x = ±!1T, which are "folded" onto u ~ - 1 and
= 0), respectively.
Figure 389 illustrates this further. The upper and lower sides of the rectangle are mapped
onto semi-ellipses and the vertical sides onto -cosh I ~ If ~ -I and I ~ u ~ cosh I
(v = 0), respectively. An application to a potential problem will be given in Prob. 5 of
Sec. 18.2.
u ~ I (v
SEC. 17.4
743
Conformal Mapping by Other Functions
v
y
1
C
B
A
D
n;
n;
-2
2
-1
E
x
B"
C*
E*
F*
U
F
Fig. 389.
Mapping by w = sin z
The mapping w = cos.: could be discussed independently, but since
Cosine Function.
(3)
= cos Z = sin (z + !7T),
w
we see at once that this is the same mapping as sin z preceded by a translation to the right
through !7T units.
Hyperbolic Sine.
Since
(4)
\I"
= sinh z = -i sin (i.:),
the mapping is a counterclockwise rotation Z = i.: through!7T (i.e .. 90 0 ). followed by the
sine mapping Z* = sin Z. followed by a clockwise 90 0 -rotation w = -iZ*.
Hyperbolic Cosine.
This function
(5)
w
= cosh z = cos (d
defines a mapping that is a rotation Z = i.: followed by the mapping tv = cos Z.
Figure 390 shows the mapping of a semi-infinite strip onto a half-plane by w = cosh z.
Since cosh 0 = I, the point z = 0 is mapped onto w = I. For real.: = x ~ 0, cosh.: is
real and increases with increasing x in a monotone fashion. starting from I. Hence the
positive x-axis is mapped onto the portion 1I ~ I of the lI-axis.
For pure imaginary z = iy we have cosh iy = cos y. Hence the left boundary of the strip
is mapped onto the segment I ~ u ~ - I of the lI-axis, the point z = 7Ti conesponding to
w
= co.,h i7T = cos 7T =
-I.
On the upper boundary of the strip. y = 7T, and since sin 7T = 0 and cos 7T = -I, it follows
that this part of the boundary is mapped onto the portion 1I ~ -1 of the u-axis. Hence
the boundary of the strip is mapped onto the lI-axis. It is not difficult to see that the interior
of the strip is mapped onto the upper half of the w-plane. and the mapping is one-to-one.
This mapping in Fig. 390 has applications in potential theory, as we shall see in
Prob. 12 of Sec. 18.3.
~b ~,h; -x
Fig. 390.
-1
Mapping by w
0
=
1
cosh z
U
744
CHAP. 17
Conformal Mapping
Tangent Function. Figure 391 shows the mapping of a vertical infinite strip onto the
unit circle by w = tan z, accomplished in three steps as suggested by the representation
(Sec. 13.6)
(e iz - e-iz)/i
(e 2iZ - I )/i
smz
w=tanz=
e iz + e- iz
cos Z
e 2iz + I
Hence if we set Z = e 2iz and use lIi = -i, we have
(6)
w = tan
z= -
Z-1
W= Z+
iW.
I .
We now see that w = tan z is a linear fractional transformation preceded by an exponential
mapping (see Sec. 17.1) and followed by a clockwise rotation through an angle!7T (90 0 ) .
The strip is S: -~7T < X < ~7T, and we show that it is mapped onto the unit disk in the
w-plane. Since Z = e 2iz = e-2y+2ix, we see from (10) in Sec. 13.5 that Izi = e- 2y ,
Arg Z = 2x. Hence the vertical lines x = - 7T/4, 0, 7T/4 are mapped onto the rays
Arg Z = -7T!2, 0, 7T/2, respectively. Hence S is mapped onto the right Z-half-plane. Also
IZl = e- 2y < 1 if y > 0 and Izi > 1 if y < O. Hence the upper half of S is mapped inside
the unit circle Izi = 1 and the lower half of S outside Izi = L as shown in Fig. 391.
Now comes the linear fractional transformation in (6), which we denote by g(Z):
(7)
W
=
g(Z)
Z-l
=
Z+l
For real Z this is real. Hence the real Z-axis is mapped onto the real W-axis. FUl1hermore,
the imaginary Z-axis is mapped onto the unit circle IwJ = I because for pure imaginary
Z = iY we get from (7)
Iwi = Ig(iY)1 =
I
iY - I
iY + 1
I=
1.
The right Z-half-plane is mapped inside this unit circle Iwi = 1, not outside, because
Z = I has its image g(l) = 0 inside that circle. Finally, the unit circle IZI = 1 is mapped
y
,, .
I
- -,,,
I
I
I
\
(z-plane)
(Z-plane)
Fig. 391.
,, .
I
\
I
x
v
,,
I
'-
.'
~
I
(W-plane)
Mapping by w
=
tan z
- -,,,
\
I
I
I
\
\
,,
I
'-
.'
(w-plane)
~
I
u
SEC. 17.4
745
Conformal Mapping by Other Functions
onto the imaginary W-axis. because this circle is Z
expression. namely,
eie!>/2 _
eicb/2
= eie!>. so that (7) gives a pure imaginary
e- icbl2
i sin (¢/2)
+ e- i e!>/2
cos (¢12)
From the W-plane we get to the w-plane simply by a clockwise rotation through nIl; see (6).
Together we have shown that w = tan.: maps S: - nl4 < Re z < nl4 onto the unit
disk
= 1, with the four quarters of S mapped as indicated in Fig. 391. This mapping
is conformal and one-to-one.
Iwl
=
11-71 CONFORMAL MAPPING w
e'
Find and sketch the image of the given region under w = e 2 •
1. 0;;; x;;; 2, -71';;; y ;;; 71'
2. - I ;;; x ;;; 0, 0 ;;; y ;;; 71'12
3. -0.5 < x < 0.5, 371/4 < Y < 571'/4
4. -3 < x < 3, 71'14 < Y < 371'/4
5. 0 < x < 1. 0 < Y < 71'
6. x < 0, - 71'12 < Y < 71'12
7. x arbitrary, 0 ~ y ~ 2IT
8. CAS EXPERIMENT. Conformal Mapping. If your
CAS can do conformal mapping, use it to solve
Prob. 5. Then increase y beyond IT, say, to 5071' or 10071.
State what you expected. See what you get as the
image. Explain.
19-121 CONFORMAL MAPPING w = sinz
Find and sketch or graph the image of the given region
under w = sin z.
9. 0 ;;; x ;;; 71', 0 ;;; y ~ J
10. 0 < x < 71'/6, y arbitrary
11. 0 < x < 271', J < Y < 5
12. - 71'/4 < x < 71'/4, 0 < Y < 3
13. Determine all points at which H" = sin Z IS not
conformal.
14. Find and sketch or graph the images of the lines x = O.
±71'/6, ±71'/3. ±71'/2 under the mapping H" = sin z.
15. Find an analytic function that maps the region R
bounded by the positive x- and .v-axes and the hyperbola
X)' = 71'12 in the first quadrant onto the upper half-plane.
Hint. First map the region onto a horizontal strip.
16. Describe the mapping H" = cosh z in terms of the
mapping w = sin z and rotations and translations.
17. Find all points at which the mapping w = cosh 71;;: is
not conformal.
118-221
CONFORMAL MAPPING w = cos z
Find and sketch or graph the image of the given region
under w = cos ;;:.
18. 0 < x < 71'12. 0 < Y < 2
19. 0 < x < 71'. 0 < \. < I
20. - I ~ x ;;; 1. 0 ~ y ~
21.
71'
< x < 271'. y < 0
22. 0 < x < 271. l/2 < Y <
23. Find the images of the lines \'
mapping w = cos .<:.
zz+
= C = COllst under the
I
24. Show that w = Ln - - maps the upper half-plane
I
onto the horizontal strip 0
the figure.
A
~
1m w
~ 71'
BCD
!
(=)
-1
)
E
I
0
as shown in
(=)
(z-plane)
1[i
6----
c"
D*(=J
E" = A *
B*(=J
6----
o
(w-planeJ
Problem 24
25. Find and sketch the image of R: 2 ;;; Izl ~ 3,
71'/4 ~ () ~ 71'12 under the mapping w = Ln z.
746
17.5
CHAP. 17
Conformal Mapping
Optional
Riemann Surfaces.
Riemann sUifaces are sUifaces on willch multivalued relations, such as w = v':: or w = In z,
become single-valued, that is, functions in the usual sense. We explain the idea. which is
simple-but ingenious, one of the greatest in complex analysis.
The mapping given by
w
(I)
=
u
+ iv = .:2
(Sec. 17.1)
is conformal. except at ;: = 0, where w' = 2.:: = O. At Z = 0, angles are doubled under
the mapping. Thus the right .:-half-plane (including the positive y-axis) is mapped onto
the full w-plane, cut along the negative half of the £I-axis; this mapping is one-to-one.
Similarly for the left z-half-plane (including the negative y-axis). Hence the image of the
full .:-plane under w = Z2 "covers the w-plane twice" in the sense that every w *- 0 is the
image of two z-points: if ZI is one. the other is -Zl. For example. z = i and -i are both
mapped onto w = - 1.
Now comes the crucial idea. We place those two copies of the cut w-plane upon each
other so that the upper sheet is the image of the right half .::-plane R and the lower sheet
is the image of the left half .::-plane L. We join the two sheets crosswise along the cuts
(along the negative u-axis) so that if;: moves from R to L. its image can move from the
upper to the lower sheet. The two origins are fastened together because w = 0 is the image
of just one .::-point, z = o. The surface obtained is called a Riemann surface (Fig. 392a).
w = 0 is called a "winding point" or branch point. w = Z2 maps the full z-plane onto
tills surface in a one-to-one manner.
By interchanging the roles of the variables
relation
z and w it follows that the double-valued
w=~
(2)
(Sec. 13.2)
becomes single-valued on the Riemann surface in Fig. 392a, that is, a function in the usual
Its image is
sense. We can let the upper sheet conespond to the principal value of
the right w-half-plane. The other sheet is then mapped onto the left w-half-plane.
-vz.
(a) Riemann surface of
Vi
Fig. 392.
(b) Riemann surface of
"Vz
Riemann surfaces
\YZ
Similarly, the triple-valued relation w =
becomes single-valued on the three-sheeted
Riemann surface in Fig. 392b, which also has a branch point at z = o.
Chapter 17 Review Questions and Problems
747
The infinitely many-valued natural logarithm (Sec. 13.7)
w
=
=
In ~
+
Ln ~
(11
21l'lTi
= 0, ±l. ±2.... )
becomes single-valued on a Riemann surface consisting of infinitely many sheets. w = Ln::.
corresponds to one of them. This sheet is cut along the negative x-axis and the upper edge
of the slit is joined to the lower edge of the next sheet, which con·esponds to the argument
'IT' < () ~ 317. that is, to
w
=
Ln::.
+
2ITi.
The principal value Ln ~ maps its sheet onto the horizontal strip -17 < v ~ 17. The function
l1' = Ln ~ + 2 m maps its sheet onto the neighboring strip 17 < V ~ 317, and so on. The
mapping of the points z *- 0 of the Riemann surface onto the points of the w-plane is
one-to-one. See also Example 5 in Sec. 17.1.
H· = ~. Find the path of the image point U'
of a point::. thal moves twice around the unit circle.
starting from the initial position::. = 1.
two sheets that may be cut along the line segment from
I to 2 and joined crosswise. HillT. Introduce polar
coordinates:: - I = rleiH1. : - 2 = r 2 e i62 .
1. Consider
2. Show that the Riemann surface of w = ~ consists of
11 sheets and has a branch point at ::. = o.
3. Make a sketch. similar to Fig. 392, of the Riemann
surface of ~.
15-101
7. 5
4. Shov. that the Riemann surtace of II· = Y(:: - 1)(: - 2)
has branch points at ::. = I and::. = 2 and consi~ts of
£ .--
RIEMANN SURFACES
Find the branch points and the number of sheets of the
Riemann surface.
6. \/(1 - ?)(4 - ::2)
5. \/3: + 5
+
V'2:
+
8. In (3: - 4i)
i
9. e V2
10.
W
=..IfES T ION SAN 0 PRO B L EMS
1. How did we define the angle of imersection of two
oriented curves, and what does it mean to say that a
mapping is conformal?
2. At what points is a mapping w = f(::) by an analytic
function not confonnal? Gi ve examples.
3. What happens to angles at::o under a mapping w = f(:)
if J' (Zo) = 0, f"(::o} = o. f"'(::.o) *- O?
4. What do "surjective." "injective." and "'bijective"
mean?
5. What mapping gave the 10ukowski airfoil?
6. What are linear fractional transformations (LFTs)? Why
are they important in connection with the extended
complex plane?
7. Why did we require that ad - be *- 0 for a LFT?
8. What are fixed points of a mapping? Give examples.
9. Can you remember mapping properties of II· = sin::.?
cos:? e Z ?
10. What is a Riemann surface? Why was it imroduced?
Explain the simplest example.
111-161
MAPPING w
= Z2
Find and sketch the image of the given curve or region under
1V
= Z2.
13. Izl = 4.5, larg zl < nl4
12. xy = -4
14.0 < Y < 2
15. ! < x
<
16.
117-22/
MAPPING w = l/z
11. Y = -1, y = I
1
1m:: > 0
Find and sketch the image of the gi ven curve or region under
w = II::.
17. x = -1
18. Y = 1
CHAP. 17
748
19. Iz - ~I = ~
21. larg zl < 7T/4
20.
22.
Conformal Mapping
Izl
Izl
< ~, )' < 0
< I, x < 0, Y > 0
135-40 1 Fixed Points. Find all fixed points of
z+2
35. w = - -
36.
Where is the mapping by the given function not conformal?
(Give reason.)
23. 5;:7 + 7;:5
24. cosh 2~
3z + 2
37. u · = - - -
i~ + 5
38.11'= - -
25. sin 2~ + cos 2~
27. exp (;:4 + Z2)
39. "',. =
123-281
129-341
26. cos 71'Z 2
28. z +
l/~
(z =/= 0)
LINEAR FRACTIONAL
TRANSFORMATIONS (LFTs)
29. 0, 1, 2 onto 0, i, 2i, respectively
30. -1. 1, 2 onto O. 2, 312, respectively
31. 1, -1, -i onto I, -I, i, respectively
32. -1, -I, i onto 1 - i, 2, 0, respectively
GO.
1
z-I
141-451
(2
+ i) z +
: + 2i
5~
I
z- ;
40.
lI'
= ~4
+i
+
~
- 81
GIVEN REGIONS
Find an analytic function II' = .f(z) that maps:
41. The infinite strip 0 < Y < 71'/3 onto the upper half-plane
v> O.
Find the LFT that maps
33. 0,
+
~
FAILURE OF CON FORMALITY
-
2i~
1
\1'= - - -
-2 onto O. 1. "". respectively
34. O. i, 2i onto 0, x, 2i
42. The intelior of the unit circle Izl = I onto the exterior
of the circle Iw + 11 = 5.
43. The region x > 0, Y > 0, xy < k omo the strip
O<v<1.
44. The semi-disk Izl < 1. x > 0 onto the exterior of the
unit circle 111'1 = 1.
45. The sector 0 < arg Z < 71'/3 onto the region u < 1.
Conformal Mapping
A complex function w = f(::.) gives a mapping of its domain of definition in the
complex z-plane onto its range of values in the complex no-plane. If fez) is analytic,
this mapping is conformal, that is, angle-preserving: the images of any two
intersecting curves make the same angle of intersection, in both magnitude and sense,
as the curves themselves (Sec. 17.1). Exceptions are the point" at which!' (z) = 0
("critical points," e.g. z = 0 for w = Z2).
For mapping properties of eZ , cos z, sin z, etc. see Secs. 17.1 and I7.4.
Linear fractional transformations, also called Mobius tran~formations
(1)
w=
az + b
cz + d
(Secs. 17.2, 17.3)
(ad - bc *- 0) map the extended complex plane (Sec. 17.2) onto itself. They solve
the problems of mapping half-planes onto half-planes or disks, and disks onto disks
or half-planes. Prescribing the images of three points determines (I) uniquely.
Riemann surfaces (Sec. 17.5) consist of several sheets connected at certain points
called brallch poil1ls. On them, multi valued relations become single-valued, that is,
functions in the usual sense. Examples. For w = Vz we need two sheets (with
branch point 0) since this relation is doubly-valued. For no = In::. we need infinitely
many sheets since this relation is infinitely many-valued (see Sec. 13.7).
18
/
Complex Analysis and
Potential Theory
Laplaces's equation V2~ = 0 is one of the most important PDEs in engineering
mathematics, because it occurs in gravitation (Secs. 9.7, 12.10), electrostatics (Sec. 9.7),
steady-state heat conduction (Sec. 12.5), incompressible fluid flow, etc. The theory of
solutions of this equation is called potential theory (although "potential" is also used in
a more general sense in connection with gradients; see Sec. 9.7).
In the "two-dimensional case" when ~ depend~ only on two Cartesian coordinates x
and y, Laplace's equation becomes
From Sec. 13.4 we know that then its solutions ~ are closely related to complex analytic
functions ~ + i '\{t. This relation is the main reason for the importance of complex analysis
in physics and engineering. (We use the notation <l> + i'\{t since u + iv will be needed
in conformal mapping.)
In this chapter we shall consider this connection and its consequences in detail and
illustrate it by modeling typical examples from electrostatics (Secs. 18.1. 18.2), heat
conduction (Sec. 18.3). and hydrodynamics (Sec. 18.4). This will lead to boundary value
problems, some of which involving functions whose mapping properties we have studied
in Chap. 17. Further relating to that chapter. in Sec. 18.2 we explain conformal mapping
as a method in potential theory.
In Sec. 18.5 we derive the important Poisson formula for potentials in a circular disk.
Finally, in Sec. 18.6 we show that results on analytic functions can be used to
characterize general properties of harmonic functions (solutions of Laplace's equation
whose second partial derivatives are continuous).
Prerequisite: Chaps. 13, 14. 17.
References and Answers to Problems: App. 1 Part D, App. 2.
749
CHAP. 18
750
18.1
Complex Analysis and Potential Theory
Electrostatic Fields
The electrical force of attraction or repulsion between charged particles is governed by
Coulomb's law. This force is the gradient of a function ¢, called the electrostatic
potential. At any points free of charges, ¢ is a solution of Laplace's equation
The surfaces ¢ = const are called equipotential surfaces. At each point P at which the
gradient of ¢ is not the zero vector, it is perpendicular to the surface ¢ = const through
P; that is. the electrical force has the direction perpendicular to the equipotential surface.
(See also Secs. 9.7 and 12.10.)
The problems we shall discuss in this entire chapter are two-dimensional (for the reason
just given in the chapter opening), that is, they model physical systems that lie in
three-dimensional space (of course!). but are such that the potential <P is independent of
one of the space coordinates. so that ¢ depends only on two coordinates. which we call
x and y. Then Laplace's equation becomes
(1)
Equipotential surfaces now appear as equipotential lines (curves) in the xy-plane.
Let us illustrate these ideas by a few simple basic examples.
E X AMP L E 1
Potential Between Parallel Plates
Find the potential If> of the field between two parallel conducting plates extending to infinity (Fig. 393). which
are kept at potentials If>l and <1>2. respectively.
Soluti01l. From the shape of the plates it follows that <I> depends only on x. and Laplace', equation becomes
<1>" = O. By integrating twice we obtain <I> = 0\' + b. where the constants 1I and b are determined by the given
boundary values of (I> on the plates. For example. if the plates correspond to x
=
-I and x
=
I, the solution is
•
The equipotential surfaces are parallel planes.
x
Fig. 393.
Potential in Example 1
x
Fig. 394.
Potential in Example 2
SEC. 18.1
751
Electrostatic Fields
E X AMP L E 2
Potential Between Coaxial Cylinders
Find the potential <I> between two coaxial conducting cylinders extending to infinity on both ends (Fig. 394)
and kept at potentials <1>1 and <1>2' respectively.
\1.,2 l,
Solution.
Here <I> depends only on r =
+
for reasons of symmetry, and Laplace's equation
lIee = 0 [(5). Sec. 12.9] with tlee = 0 and II = <I> becomes r'I)" + <1>' = O. By separating variable~
and integrating we obtain
r2lf.,"'r
+
ru,.
+
<1>"
<1>'
r
In <1>' = -In r +
a.
<I>
,
a
=-.
r
<I> = a In,. + b
and a and b are determined by the given values of <l> on the cylinders. Although no infinitely extended conductors
exist, the field in our idealiLed conductor will approximate the field in a long finite conductor in that part which
•
is far away from the two ends of the cylinders.
E X AMP L E 3
Potential in an Angular Region
Find the potential <I> between the conducting plates in Fig. 395. which are kept at potentials <1>1 <the lower plate)
and <1>2' and make an angle a. where 0 < a::;;: 7T. lin the figure we have a = 120 0 = 27T/3.)
Y
*
Arg ~ (: = 1: + iy 0) b constant on rays e = conST. It is harmonic since it is the imaginary
part of an analytic function, Ln ~ (Sec. 13.7). Hence the solution is
Solution. e =
<I>(x. y) = 0+ b Arg<:
x
with a and b determined from the two boundary conditions (given values on the plates)
Fig. 395. Potential
in Example 3
e=
)'
arctan _.
x
•
Complex Potential
Let CP(x, y) be hannonic in some domain D and qr(x, y) a hannonic conjugate of cP in D.
(See Sec. 13.4. where we wrote 1I and v, now needed in conformal mapping from the next
section on: hence the change to cP and qr.) Then
(2)
F(z)
=
CP(x, y) + iqr(x, y)
is an analytic function of z = x + iy. This function F is called the complex potential
corresponding to the real potential CPo Recall from Sec. 13.4 that for given CP, a conjugate
qr is uniquely determined except for an additive real constant. Hence we may say the
complex potential, without causing misunderstandings.
The use of F has two advantages. a technical one and a physical one. Technically. F is
easier to handle than real or imaginary parts, in connection with methods of complex
analysis. Physically, -qr has a meaning. By conformality, the curves qr = const intersect
the equipotential lines cP = const in the xy-plane at right angles [except where F' (z) = 0].
Hence they have the direction of the electrical force and. therefore, are called lines of force.
They are the paths of moving charged particles (electrons in an electron microscope, etc.).
E X AMP L E 4
Complex Potential
In Example I, a conjugate is 'I' =
01'.
It follows that the complex potential is
F(;:) = a:
+
b = ax
+
b
+ iay,
752
CHAP. 18
Complex Analysis and Potential Theory
•
and the lines of force are horizontal straight lines y = comt parallel to the t-axis.
E X AMP L E 5
Complex Potential
In Example 2 we have <1>
potential is
= a In r +
b
= a In Izl +
h. A conjugate is q,
F(z) = a Ln z
+
= a Arg z. Hence the complex
b
and the lines of force are straight lines through the origin. F(;:) may also be interpreted as the complex potential
of a source line (a wire perpendicular to the xy-plane) whose trace in the xr-plane is the origin.
•
E X AMP L E 6
Complex Potential
In Example 3 we get F(:::) by noting that i Ln ::: = i In
kl -
F(:;;) = a - ib Ln z = a
+
Arg:::, multiplying this by -b. and adding a:
b Arg::: - ib In
We see from this that the lines of force are concentric circles
Izi
= COllst.
Id.
Can you sketch them?
•
Superposition
More complicated potentials can often be obtained by superposition.
E X AMP L E 7
Potential of a Pair of Source Lines (a Pair of Charged Wires)
Determine the potential of a pair of oppositely charged source lines of the same 5trength at the
-c on the real axis.
point~
- = c and
z=
Soluti01l.
From Examples 2 and 5 it follows that the pOlential of each of the source lines is
<1>1 = Kin
Iz - cl
and
<1>2 = -K In
Iz + cl,
re5pectively. Here the real constant K measures the strength (amount of charge). These are the real parts of the
complex potentials
and
Hence the complex pOlential of the combination of the two source lines is
(3)
F(z) = F 1 (:::)
+ F 2 (:::) = K fLn
(~
- c) - Ln (::: + c»).
The equipotential lines are the curves
<1> = Re F(z) = K In
I
Z Z
c
+ c
I=
consc.
I~7- -+ I
thus
...
(c.·
=
("01lSt.
These are circles, as you may show by direct calculation. The lines of force are
"IjJ
= [m F(z) =
K[ Arg (z - c) - Arg (z
+ c)l =
const.
We write this briefly (Fig. 396)
Now 81 - 82 is the angle between the line segments from:: to c and -c (Fig. 396). Hence the lines of force
are the curves along each of which the line segment S: -c ~ t ~ C appears under a consram angle. These curves
are the totality of circular arcs over S. as is (or should be) known from elementary geometry. Hence the lines
of force are circles. Figure 397 shows some of them together with some equipotential lines.
In addition to the interpretation as the potential of two source lines, this potential could also be thought of a~
the potential between two circular cylinders whose axes are parallel but do not coincide. or a5 the potential
between two equal cylinders that lie outside each other. or as the potential hetween a cylinder and a plane waH.
Explain this, using Fig. 397.
•
The idea of the complex potential as just explained is the key to a close relation of potential
theory to complex analysis and will recur in heat flow and fluid flow.
SEC. 18.1
753
Electrostatic Fields
z
-c
c
Arguments in Example 7
Fig. 396.
........
=..·'''''''-'·W·
Fig. 397. Equipotential lines and lines
of force (dashed) in Example 7
=:1- .. - - Z- - •.
-
••
11-41
POTENTIAL
Find and sketch the potential. Find the complex potential:
1. Between parallel plates at x = - 3 and 3. potentials
140 V and 260 V. respectively
2. Between parallel plates at t = -4 and 10. potentials
4.4 kV and 10 kV. respectively
3. Between the axes (potential 110 V) and the hyperbola
xy = I (potential 60 V)
113-151
POTENTIALS FOR OTHER
CONFIGURATIONS
13. Show that F(z) = arccos z (defined in Problem Set
13.7) gives the potential in Figs. 398 and 399.
4. Between parallel plates at y = x and x + k. potentials
o and
15-81
100 V. respectively
COAXIAL CYLINDERS
Find the potential between two infinite coaxial cylinders of
radii r 1 and rz having potentials VI and V z , respectively.
Find the complex potential.
5. rl = 0.5, rz = 2.0, VI = -IIOV. V 2 = llOV
6.
7.
8.
1'1
1'1
1'1
= 1, r2 = 10. VI = 100 V. V 2 = I kV
= I, 1"z = 4. VI = 200 V. V2 = 0
= 0.1. 1'2 = 10. VI = 150 V. V 2 = 50 V
Fig. 398.
Slit
9. Show that <1>
= el'Tr = (l/'Tr) arctan (ylx) is harmonic in
the upper half-plane and satisfies the boundary condition
<1>(x, 0) = I if x < 0 and 0 if x > O. and the cOlTesponding
complex potential is F(z) = -(i/'Tr) Ln z.
10. Map the upper half z-plane onto the unit disk Iwl ~ I so
that 0, x. - I are mapped onto I, i, -i, respectively. What
are the boundary conditions on Iwl = I resulting from
the potential in Prob. 9'! What is the potential at w = O?
n. Verify by calculation that the equipotential lines in
Example 7 are circles.
12. CAS EXPERIMENT. Complex Potentials. Graph
the equipotential lines and lines offorce in (a)-(d) (four
graphs, Re F(z) and 1m F(z) on the same axes). Then
explore further complex potentials of your choice with
the purpose of discovering configurations that might
be of practical interest.
(a) F(z) = :;:2
(b) F(z) = iz 2
(e) F(z) = liz
(d) F(:;:) = ifz
Fig. 399.
Other apertures
14. Find the real and complex potentials in the sector
-'Tr16 ~ e ~ 'Tr16 between the boundary e = ± 'Tr16
(kept at 0) and the curve X3 - 3xy2 = I, kept at 110 V.
15. Find the potential in the first quadrant of the x)'-plane
between the axes (having potential 220 V) and the
hyperbola xy = I (having potential 110 V).
CHAP. 18
754
18.2
Complex Analysis and Potential Theory
Use of Conformal Mapping. Modeling
Complex potentials relate potential theory closely to complex analysis, as we have just
seen. Another close relation results from the use of conformal mapping in modeling and
solving boundary value problems for the Laplace equation, that is, in finding a solution
of the equation in some domain assuming given values on the boundary ("Dirichlet
problem"; see also Sec. 12.5). Then conformal mapping is used to map a given domain
onto one for which the solution is known or can be found more easily. This solution is
then mapped back to the given domain. This is the idea. That it works is due to the fact
that harmonic functions remain harmonic under conformal mapping:
Harmonic Functions Under Conformal Mapping
THEOREM 1
Let <1>* be harmonic in a domain D* il1 the tv-plane. Suppose that H' = u + iv = fez)
is analytic in a domain D in the z-plane and maps D cOflformally onto D*. Then
the jUllction
<1>(x, y)
(1)
= <1>*(II(X, y), vex, y»
is harmollic in D.
PROOF
E X AMP L E 1
The composite of analytic functions is analytic, as follows from the chain IUle. Hence,
taking a harmonic conjugate 1]r*(u, v) of <1>*, as defined in Sec. 13.4, and forming the
analytic function F*(w) = <1>*(u, v) + i1]r*(u, v), we conclude that F(z) = F*(J(z» is
analytic in D. Hence its real part <1>(x, y) = Re F(z) is hannonic in D. This completes the
proof.
We mention without proof that if D* is simply connected (Sec. 14.2), then a harmonic
conjugate of <1>* exists. Another proof of Theorem 1 without the use of a harmonic
•
conjugate is given in App. 4.
Potential Between Noncoaxial Cylinders
Model the electrostatic potential between the cylinders C I : IzI = I and C2 : J;: - 2/51 = 2/5 in Fig. 400. Then
give the solution for the case that C I is grounded, VI = 0 V, and C2 has the potential V 2 = 110 v.
We map the unit disk Izl = I onto the unit disk Iwl = I in such a way that C2 is mapped onto
some cylinder C2 *: Iwl = '-0' By (3), Sec. 17.3, a linear fractional transfonTlation mapping the unit disk onto
the unit disk is
Solution.
z- b
u·= - - b::: - 1
(2)
x
(a) z-plane
u
(bl w·plane
Fig. 400.
Example 1
SEC. 18.2
Use of Conformal Mapping. Modeling
755
where we have chosen b = ~o real without restriction. ~o is of no immediate help here because centers of circles
do not map onto centers of the images. in general. However. we now have two free constants band 1'0 and shall
succeed by imposing two reasonable conditions, namely, that 0 and 4/5 (Fig. 400) should be mapped onto 1'0
and -ro, respectively. This gives by (2)
O-b
'"0 = 0 _ I = h.
4/5 - b
and with this,
415 - ro
-ro = 4bl5 - I
4"0/5 - I
a quadratic equation in "0 with solutions "0 = 2 (no good because "0 < I) and "0 = 112. Hence our mapping
function (2) with b = 112 becomes that in Example 5 of Sec. 17.3,
(3)
From Example 5 in Sec. 18.1. writing lI" for z we have as the complex potential in the lI"-plane the function
a Ln w + k and from this the real potential
F*{II') =
<1>''(u. v) = Re F*{II') = a In
[wi + k.
This is our model. We now determine a and k from the boundary conditions. If [wi = I, then <1>* = {/ In I + k = O.
hence k = O. If [wi = ro = 112. then <1>" = {/ In (I/2) = 110. hence (/ = 110/ln (l12) = -158.7. Substitution
of (3) now gives the desired ~olution in the given domain in the :::-plane
F(:::) =
F*(f(~» =
1I
")- - I
Ln ~- _ 2 .
The real potential is
<1>{x. y)
= Re F(:::)
= {/ In
I
?, -
I
~"_ 2
I
.
(/ = -158.7.
Can we "see" this re,ult? Well. <1>(x. Y) = COIlst if and only if [(2::: - 1)/(::: - 2)[ = COllst. that is. [wi = CUllst
by (2) with b = 112. These circles are images of circles in the :::-plane because the inverse of a linear fractional
transformation is linear fractional (see (4). Sec. 17.2). and any such mapping maps circles onto circles (or
straight lines). by Theorem I in Sec. 17.2. Similarly for the rays arg II' = COllst. Hence the equipotential lines
<1>{x, y) = const are circles. and the lines of force are circular arcs (dashed in Fig. 400). These two familie< of
curves intersect orthogonally. that is, at right angles. as shown in Fig. 400.
•
E X AMP L E 1
Potential Between Two Semicircular Plates
Model the potential between two semicircular plates PI and P 2 in Fig. 40Ia having potentials -3000 V and
3000 V. respectively. Use Example 3 in Sec. 18.1 and conformal mapping.
Solution.
Step 1. We map the unit disk in Fig. 40la onto the right half of the II'-plane (Fig. 401b) by using
the linear fractional transformation in Example 3, Sec. 17.3:
II'
= I(:::) =
I+z.
1-:::
v
2 kV
3 kV -
1 kV
o
u
,
-3 kV
-_..-'\
-2 kV
(a) z-plane
(b) w-plane
Fig. 401.
Example 2
756
CHAP. 18
Complex Analysis and Potential Theory
The boundary Izl = I is mapped onto the boundary II = 0 lthe v-axis). with z = -I. i. I going onto w = O. i, "",
respectively, and: = -i onto w = - i. Hence the upper semicircle of Izl = I is mapped onto the upper half,
and the lower semicircle onto the lower half of the v-axis. so that the boundary conditions in the w-plane are
as indicated in Fig. 401b.
Step 2. We determine the potential <1>*(1/, v) in the right half-plane of the w-plane. Example 3 in Sec. 18.1 with
a = 7T, VI = -3000. and V 2 = 3000 [with <1>*(11. v) instead of <1>(x. y)] yields
6000
v
cp = arctan - .
<1>*(u, v) = - - cpo
7T
11
On the positive half of the imaginary axis (cp = 7T12), this equals 3000 and on the negative half -3000. as it
should be. <1>' is the real part of the complex potential
"
F'(lI') = -
6000 i
---
7T
Ln w.
Step 3. We substitllle the mapping function into F* to get the complex potential F(::.) in Fig. 40la in the
form
6000i
1+;:
F(;:) = F*(f(;:» = - - - Ln - - .
'if
1-:;:
The real part of this is the potential we wanted to determine:
6000
I+z
<1>(x, y) = Re F(;:) = - - 1m Ln - 7T
I - ;:
6000
- - A rn
71
0
I+z
1 -::
--
As in Example I we conclude that the equipotential lines <1>(x. y) = const are circular arcs because they correspond
to Arg [(I + ::.)/(1 - ;:)] = COIISt. hence to Arg w = COliS/' Also, Arg w = COIISt are rays trom 0 to x, the images
of Z = -I and;: = I. respectively. Hence the equipotential lines all have -I and I (the points where the
boundary potential jumps) as their endpoints lFig. 401a). The lines of fonce are circular aln, too, and since they
must be orthogonal to the equipotential lines, their centers can be obtained as intersections of tangents to the
unit circle with the ,,-axis, (Explain!)
•
Further examples can easily be constructed. Just take any mapping w = J(:) in Chap. 17.
a domain D in the z-plane, its image D* in the w-plane. and a potential <1>* in D~'. Then
(1) gives a potential in D. Make up some examples of your own. involving, for instance.
linear fractional transformations.
Basic Comment on Modeling
We formulated the examples in this section as models on the electrostatic potential. It
is quite important to realize that this is accidental. We could equally well have phrased
everything in terms of (time-independent) heat flow; then instead of voltages we would
have had temperatures, the equipotential lines would have become isotherms (= lines
of constant temperature), and the lines of the electrical force would have become lines
along which heat flows from higher to lower temperatures (more on this in the next
section). Or we could have talked about fluid flow; then the electrostatic lines of force
would have become streamlines (more on this in Sec. 18.4). What we again see here is
the unifyillg power of mathematics: different phenomena and systems from different
areas in physics having the same types of model can be treated by the same mathematical
methods. What differs from area to area is just the kinds of problems that are of practical
interest.
SEC 18.3
757
Heat Problems
=:.:I:_=.: ..... 'A~.====:
....
1. Verify Theorem 1 for <1>':'(11. U) = 112 - U 2 •
Z
W = fez) = e and any domain D.
2. Verify Theorem I for <1>*(lI. u) = lIU, W = .Hz) = e Z •
and D: x ::::2 0,0::::2 Y ::::2 7T, Sketch D and D*.
3. Carry out all steps of the second proof of Theorem I
(given in App. 4) in detaiL
4. Derive (3) from (2).
5. Let D'~ be the image of the rectangle D:
o ~ x ::::2 ~ 7T, 0 ::::2 Y ::::2 1 under IV = sin z, and
<1>*(11, U) = 112 - u 2 • Find the corresponding
potential <1> in D and its boundary values.
6. What happens in Prob. 5 if you replace the potential
by the conjugate <1>* = 2ltu? Sketch or graph some of
the equipotential lines <1> = canst.
7. CAS PROJECT. Graphing Potential Fields.
(a) Graph equipotential lines in Probs. 1 and 2.
(b) Graph equipOtential lines if the complex potential
is F(z) = i;:.2. F(z) = e Z • F(;:.) = ie z , F(z) = eiz .
(c) Graph equipotential surfaces corresponding to
F(z) = In z as cylinders in space.
8. TEAM PROJECT. Noncoaxial Cylinders. Find the
potential between the cylinders C 1 : Izl = I (potential
VI = 0) and C2 : Iz - cl = c (V2 = 110 V), where
o < c < ~. Sketch or graph the equipotential curves
and their orthogonal trajectories for c = 0.1, 0.2, 0.3,
OA. Try to think of the further extension C 1 : Izi = 1,
C2 : I::: - cI = p *- c.
9. Find the potential <1> in the region R in the first quadrant
of the z-plane bounded by the axes (having potential
VI) and the hyperbola y = l/x (having potential 0) in
two ways. (i) directly, (ii) by mapping R onto a suitable
infinite strip,
18.3
10. (Extension of Example 2) Find the linear fractional
transfonnation z = g(Z) that maps IZI ::::2 I onto Izi ~ 1
with Z = il2 being mapped onto Z = O. Show that
ZI = 0.6 + 0.8i is mapped onto z = -I and
Z2 = -0.6 + 0.8i onto::: = I, so that the equipotential
lines of Example 2 look in Izi ::::2 1 as shown in Fig. 402.
x
Fig. 402.
Problem 10
11. The equipotential lines in Prob. 10 are circles. Why?
12. Show that in Example 2 the .v-axis is mapped onto the
unit circle in the w-plane.
13. Find the complex and real potentials in the upper
half-plane with boundary values 0 if x < 4 and 10 kV
if x > 4 on the x-axis.
14. (Angular region) Applying a suitable conformal
mapping. obtain from Fig. 401 b the potential <1> in the
angular region -!7T < Arg z < !7T such that <l> = - 3 kV
if Arg z = -!7T and <1> = 3 kV if Arg z = !7T.
15. At z = ± 1 in Fig. 401 a the tangents to the equipotential
lines shown make equal angles (7T/6). Why?
Heat Problems
Laplace's equation also governs heat flow problems that are steady, that is, time-independent.
Indeed, heat conduction in a body of homogeneous material is modeled by the heat
equation
where the function T is temperature, Tt = aT/at, t is time. and c 2 is a positive constant
(depending on the material of the body; see Sec. 12.5). Hence if a problem is steady, so
that Tt = O. and two-dimensional, then the heat equation reduces to the two-dimensional
Laplace equation
(1)
so that the problem can be treated by our present methods.
758
CHAP. 18
Complex Analysis and Potential Theory
T(x, y) is called the heat potential. It is the real part of the complex heat potential
F(::.)
=
T(x. y)
+ i'llF(x. y).
The curves T(x, y) = cOllsl are called isotherms (= lines of constant temperature) and
the curves 'IlF(x, y) = const heat flow lines, because along them, heat flows from higher
to lower temperatures.
It follows that all the examples considered so far (Secs. 18.1. 18.2) can now be
reinterpreted as problems on heat now. The electrostatic equipotential lines <I>(x, Y) = COIlSt
now become isotherms T(x, y) = C01lst, and the lines of electrical force become lines of
heat flow, as in the following two problems.
E X AMP L E 1
Temperature Between Parallel Plates
Find the temperature between two parallel plates x = 0 and x = d in Fig. 403 having temperatures 0 and 100°e.
respectively.
Solution. As in Example I of Sec. 18.1 we conclude that T(x, .1') = ax + b. From the boundary conditions,
h = 0 and a = 1001d. The answer is
T(x.yl
IOU
=
d
\" [0C].
The corresponding complex potential is F(:::) = (1 001d) z. Heat tlows hori70ntally in the negative x-direction
along the lines y = cOllsr.
•
E X AMP L E 1
Temperature Distribution Between a Wire and a Cylinder
Find the temperature field around a long thin wire of radius /'1 = I mm that is electrically heated to Tl = 500°F
and is sUlTounded by a circular cylinder of radius 1'2 = 100 mm. which is kept at temperature T2 = 60°F by
cooling it with air. See Fig. 404. (The wire is at the origin of the coordinme system.)
Solution.
T depends only on r. for reason~ of symmetry. Hence. as in Sec. 18.1 (Example 2).
T(x, .1') = a In r
+ h.
The boundary conditions are
Tl = 500 = a In 1
Hence b = 500 (since In 1 = 0) and
a =
+ b,
T2
= 60 = a
In 100
+ b.
(60 - b)/ln JOO = -95.54. The answer is
T(x, y) = 500 - 95.54 In
I'
[OF].
The isotherms are concentric circles. Heal flows from the wire radially outward to the cylinder. Sketch T as a
•
function of r. Does it look physically reasonable?
YI
Insulated
I
I
--f®J-,
,
x
u~-- ~\
~-I,X
hl~
I
Fig. 403.
Example 1
Fig. 404.
Example 2
o
T=50°C
Fig. 405.
1
Example 3
x
SEC. 18.3
759
Heat Problems
Mathematically the calculations remain the same in the transition to another field of
application. Physically, new problems may arise, with boundary conditions that would
make no sense physically or would be of no practical interest. This is illustrated by the
next two examples.
E X AMP L E 3
A Mixed Boundary Value Problem
Find the temperature distribution in the region in Fig. 405 (cross section of a solid quaI1er-cylinder), whose
vertical pOI1ion of the boundary is at 20De, the horizontal pOI1ion at 50 De, and the circular portion is insulated.
Solution.
The insulated portion of the boundary must be a heat flow line, since by the insulation, heat is
prevented from crossing such a curve, hence heat must flow along the curve. Thus the isotherms must meet
such a curve at right angles. Since T is constant along an isotherm, this means that
aT
(2)
all
=0
along an insulated poI1ion of the boundary.
Here aT/iin is the nonnal derivative of T, that is, the directional derivative (Sec. 9.7) in the direction normal
(perpendicular) to the insulated boundary. Such a problem in which Tis prescribed on one portion of the boundary
and aT/an on the other portion is called a mixed boundary value problem.
In our case, the normal direction to the insulated circular boundary curve is the radial direction toward the
origin. Hence (2) becomes aT/ilr = 0, meaning that along this curve the ~olution must not depend on r. Now
Arg;: = 0 satisfies (1). as well as this condition, and is constant (0 and r./2) on the straight portions of the
boundary. Hence the solution is of the form
T(x, y)
The boundary conditions yield a . r./2
+h
aO + b.
=
= 20 and
a.0 + b
T(x, y)
60
50 - 0,
=
= 50. This gives
0= arctan
r.
y
x
The isotherms are portions of rays 0 = COllSt. Heat flows from the x-axis along circles r = const (daShed in
Fig. 405) Lo the y-axis.
•
y!
;;~\t~ Zf t"lt =~ .
u
0
'"II
"Eo,
T
= O°C
-1 "---Insulated
1
T
= 20°C
x
"
2
(a) z-plane
2"
u
(b) w-plane
Fig. 406.
E X AMP L E 4
"---Insulated
Example 4
Another Mixed Boundary Value Problem in Heat Conduction
Find the temperature field in the upper half-plane when the x-axis is kept at T
for -I < x < I, and is kept at T = 20 e for x> I (Fig.406a).
=
oDe for x
<
-I, is insulated
0
Solution. We map the half-plane in Fig. 406a onto the veI1ical strip in Fig. 406b. find the temperature T*(u. v)
there, and map it back to get the temperature T(x. y) in the half-plane.
The idea of using that strip is suggested by Fig. 388 in Sec. 17.4 with the roles of;: = x + iy and w = u + iv
interchanged. The figure shows that z = sin w maps our present strip onto our half-plane in Fig. 406a. Hence
the inverse function
w
=
I(:::)
=
arcsin:::
760
CHAP. 18
Complex Analysis and Potential Theory
maps that half-plane onto the strip in the w-plane. This is the mapping function that we need according to
Theorem I in Sec. 18.2.
The insulated segment -] < x < ] on the x-axis maps onto the segment -wl2 < II < wl2 on the lI-axi~.
The rest of the x-axis maps onto the two vertical boundary portions 1I = - wl2 and w/2, u > 0, of the strip.
This gives the transformed boundary conditions in Fig. 406b for T*(u. u). where on the insulated hori70ntal
boundary, iJT*/iJ/I = iJT*/iJu = 0 because u is a coordinate normal to that ~egment.
Similarly to Example I we obtain
20
T*(u. u) = 10 + - 1/
which satisfies all the boundary conditions. This is the real part of the complex potential F"(I\')
Hence the complex potential in the ~-plane is
F(~) = F*<.f(~» =
10 +
20
w
10 + (20/w)w.
=
arcsin ~
and T[x, y) = Re F(z) is the solution. The isotherms are 1/ = const in the strip and the hyperbolas in the :.-plane.
perpendicular to which heat flows along the dashed ellipses from the 20°-portion to the cooler 0°-portion of the
boundary. a physically very reasonable result.
•
This section and the last one show the usefulness of conformal mappings and complex
potentials. The latter will also playa role in the next section on fluid flow.
1. CAS PROJECT. Isotherms. Graph isothenns and
lines of heat flow in Examples 2-4. Can you see from
the graphs where the heat flow is very rapid?
2. Find the temperature and the complex potential in an
infinite plate with edges y = x - 2 and y = x + 2 kept
at - 10°C and 20 De, respectively.
3. Find the temperature between two parallel plates \" = 0
and )" = d kept at temperatures ODC and 100De,
respectively. (i) Proceed directly. (ii) Use Example I
and a suitable mapping.
4. Find the temperature T in the sector 0 ~ Arg z ~ w/3,
Izl ~ 1 if T = 20°C on the x-axis, T = 50°C on
y = V3 x, and the curved portion is insulated.
5. Find the temperature in Fig. 405 if T = -20 DC on the
y-axis, T = 100DC on the x-axis, and the circular
portion of the boundary is insulated as before.
6. Interpret Prob. 10 in Sec. 18.2 as a heat flow problem
(with boundary temperatures, say, 20 DC and 300°C).
Along what curves does the heat flow?
7. Find the temperature and the complex potential in the
first quadrant of the ;:-plane if the y-axis is kept at
100°C, the segment 0 < x < I of the x-axis is insulated
and the portion x > I of the x-axis is kept at 200 De.
Hint. Use Example 4.
8. TEAM PROJECT. Piecewise Constant Boundary
Temperatures. (a) A basic building block is shown
in Fig. 407. Find the corresponding temperature and
complex potential in the upper half-plane.
(b) Conformal mapping. What temperature in the
first quadrant of the ;:-plane is obtained from la) by the
mapping w = a + Z2 and what are the transfonned
boundary conditions?
(c) Superposition. Find the temperature T'" and the
complex potential F* in the upper half-plane satisfying
the boundary condition in Fig. 408.
(d) Semi-infinite strip. Applying H' = cosh;: to (c),
obtain the solution of the boundary value problem in
Fig. 409.
v
o
a
Fig. 407.
----~
T*=T 1
U
Team Project 8(a)
v
-1
- - o·
T*=O
Fig. 408.
:1
1
0---
T' =To
T*=O
u
Team Project 8(c)
T=O
T=Tol
00
Fig. 409.
T=O
x
Team Project 8(d)
SEC. 18.4
19-141
761
Fluid Flow
11.
TEMPERATURE DISTRIBUTIONS IN
PLATES
12.Y~
~COO~~
Find the temperature T(x, y) in the given thin metal plate
whose faces are insulated and whose edges are kept at the
indicated temperatures or are insulated as shown.
10.
9.
o
y
13.
oc., VlnSulated
~
~45D
T= lOO°C
x
14.
:\.
';t
T = 50 C""""''''--''''''x
D
18.4
Fluid Flow
Laplace's equation also plays a basic role in hydrodynamics, in steady nonviscous fluid
flow under physical conditions discussed later in this section. In order that methods of
complex analysis can be applied, our problems will be two-dimensional, so that the
velocity vector V by which the motion of the fluid can be given depends only on two
space variables x and y and the motion is the same in all planes parallel to the xy-plane.
Then we can use for the velocity vector V a complex function
(1)
giving the magnitude IVI and direction Arg V of the velocity at each point z = x + iy.
Here VI and V2 are the components of the velocity in the x and y directions. V is tangential
to the path of the moving particles, called a streamline of the motion (Fig. 410).
We show that under suitable assumptions (explained in detail following the examples),
for a given flow there exists an analytic function
(2)
P(z)
= <I>(x, y) + i"'l'(x, y),
called the complex potential of the flow, such that the streamlines are gIVen by
= const, and the velocity vector or, briefly, the velocity is given by
"'l'(x, y)
(3)
V = VI
+ iV2
Fig. 410.
=
p' (z)
Velocity
762
CHAP. 18
Complex Analysis and Potential Theory
where the bar denotes the complex conjugate. \)f is called the stream function. The
function ¢ is called the velocity potential. The curves ¢(x. y) = const are called
equipotential lines. The velocity vector V is the gradient of ¢; by definition, this means
thaI
(4)
Indeed, for F = ¢ + i\)f, Eq. (4) in Sec. 13.4 is F' = ¢x +
second Cauchy-Riemann equation. Together we obtain (3):
Furthermore. since F(z) is analytic, ¢ and
\)f
i\)fx
with
\)fx
=
-¢y
by the
satisfy Laplace's equation
(5)
Whereas in electrostatics the boundaries (conducting plates) are equipotential lines, in
fluid flow the boundaries across which fluid cannot flow must be streamlines. Hence in
fluid flow the stream function is of particular importance.
Before discussing the conditions for the validity of the statements involving (2)-(5), let
us consider two flows of practical interest, so that we first see what is going on from a
practical point of view. Further flows are included in the problem set.
E X AMP L E 1
Flow Around a Corner
The complex potential F(:::) = :::2 = x 2 - y2
+
2ixy models a flow with
Equipotential lines
<I>
= x
Streamlines
'It = 2xv =
2
(Hyperbolas)
- y2 = COllst
(Hyperbolas).
const
From (3) we obtain the velocity vector
v=
2~ =
2(x - iy),
that is,
The speed (magnitude of the velocity) is
The flow may be interpreted as the flow in a channel bounded by the positive coordinates axes and a hyperbola.
say, xy = I (Fig. 411). We note that the speed along a streamline S has a minimum at the point P where the
•
cross section of the channel is large.
o
Fig. 411.
--
x
Flow around a corner (Example 1)
SEC. 18.4
761
Fluid Flow
E X AMP L E 2
Flow Around a Cylinder
Consider the complex potential
F(::.) = <D(x, y)
+
1
jqr(x. y) = ::.
+ -=-
Using the polar form:: = reiO. we obtain
F(::.) = re
iO
+ -;
e -ill
(,.
+
~)
cos ()
+i
(,. -
+)
sin ().
Hence the 'treamlines are
qr(x, y) =
(r - ~ ) sin ()
= COIISt.
In particular. '1'(.1. y) = 0 gives r - 11,. = 0 or sin () = O. Hence this streamline consists of the unit circle (r = IIr
gives r = 1) and the x-axis «(I = 0 and () = 1T). For large 1::1 the term II: in F(:) is small in absolute value, so
that for these ~ the flow is nearly uniform and parallel to the x-axis. Hence we can interpret this as a t10w around
a long circulm cylinder of unit radius that is perpendicular to the ::-plane and intersects it in the unit circle Izl = I
and whose axis cOlTesponds to ::. = O.
The flow has two stagnation points (that is, points at which the velocity V is zero). at;: = :!:: 1. This follows
from (3) and
,
I
F (::.) = I - "2'
(See Fig. 412,)
hence
•
--
x
Fig. 412.
Flow around a cylinder (Example 2)
Assumptions and Theory Underlying (2)-(5)
Complex Potential of a Flow
THEOREM 1
If the dOlllaill of flow is simply connected and the flow is irrotatiollal lind
i1lcompressible, tben the statements inl'Olving (2)--(5) bold. In particular, then the
flow has a complex potential F(;:). which is an analytic function. (Explanation of
tenns below.)
PROOF
We prove this theorem, along with a discussion of basic concepts related to fluid flow.
(a) First Assumption: Irrotational. Let C be any smooth curve in the z-plane given
by .::(s) = xes) + iy(s), where s is the arc length of C. Let the real variable Vt be the
component of the velocity V tangent to C (Fig. 413). Then the value of the real line integral
(6)
764
CHAP. 18
Complex Analysis and Potential Theory
y
x
Fig. 413. Tangential component of the
velocity with respect to a curve C
taken along C in the sense of increasing s is called the circulation of the fluid along C,
a name that will be motivated as we proceed in this proof. Dividing the circulation by the
length of C, we obtain the mean velocit:/ of the flow along the curve C. Now
Vt
Ivi cos a
=
(Fig. 413).
Hence Vt is the dot product (Sec. 9.2) of V and the tangent vector dzlds of C (Sec. 17.1);
thus in (6),
Vt ds
=
(VI -dxds
dY) ds = VI dx + V
+ V2
-
2
ds
dy.
The circulation (6) along C now becomes
JV ds J
(7)
c
=
t
c
(VI dx
+ V2 dy).
As the next idea, let C be a closed curve satisfying the assumption as in Green's theorem
(Sec. 10.4), and let C be the boundary of a simply connected domain D. Suppose further
that V has continuous partial derivatives in a domain containing D and C. Then we can
use Green's theorem to represent the circulation around C by a double integral.
fc
(8)
(VI
d"r + V
2
dy)
=
lj (da:
2
iJ :
-
l
a
)
dx dy.
The integrand of this double integral is called the vorticity of the flow. The vorticity
di vided by 2 is called the rotation
w(x, y) = -1
(9)
2
1 DefillitiollS:
b
J
±J
~a
(aV2
ax
-
-aVI) .
ay
b
f(x) dr = mean value of f on the interval a
~ x ~ b,
a
f(s) ds
= mean value of f
on C
(L
= length of C).
f(x, y) dr dy
= mean value of f
on D
(A
= area of D).
c
±JJ
D
SEC. 18.4
765
Fluid Flow
We assume the flow to be irrotational, that is, w(x, y)
aVI
(10)
ay
=
== 0
throughout the flow; thus,
o.
To understand the physical meaning of vorticity and rotation, take for C in (8) a circle.
Let r be the radius of C. Then the circulation divided by the length 27Tr of C is the mean
velocity of the fluid along C. Hence by dividing this by r we obtain the mean angular
velocity Wo of the fluid about the center of the circle:
Wo
= -127Tr2
Jf (- - D
aV2
ax
aVI) dx dy = -1-.-
7Tr2
rly
Jf
w(x. y) dx dy.
D
If we now let r - 0, the limit of Wo is the value of w at the center of C. Hence w(x, y)
is the limiting angular velocity of a circular element of the fluid as the circle shrinks to
the point (x, y). Roughly speaking, if (l spherical element of the fluid were suddenly
solidified and the surrou1lding fluid simultaneously annihilated, the elemellt would rotate
WiTh the angular velocity w.
(b) Secolld Assumption: Incompressible. Our second assumption is that the fluid is
incompressible. (Fluids include liquids, which are incompressible, and gases, such as air,
which are compressible.) Then
(11)
in every region that is free of sources or sinks, that is, points at which fluid is produced
or disappears. respectively. The expression in (11) is called the divergence of V and is
denoted by div V. (See also (7) in Sec. 9.8.)
(c) Complex Velocity Potelltial. If the domain D of the flow is simply connected
(Sec. 14.2) and the flow is irrotational, then (10) implies that the line integral (7) is
independent of path in D (by Theorem 3 in Sec. lO.2, where FI = VI, F2 = V2, F3 = 0,
and.: is the third coordinate in space and has nothing to do with our present z). Hence if
we integrate from a fixed point (a, b) in D to a variable point (x. y) in D, the integral
becomes a function of the point (x, y), say, el>(x. y):
lx, yJ
(12)
(Nx. y)
=
J
(VI dx
+
V2 dy).
(a, b)
We claim that the flow has a velocity potential el>, which is given by (12). To prove this.
all we have to do is to show that (4) holds. Now since the integral (7) is independent of
path. VI dt + V2 dy is exact (Sec. lO.2). namely, the differential of el>. that is.
VI dx
+
V2 dy
ael>
= -
ax
dx
ael>
+ -
ay
dy.
From this we see that VI = ael>/ax and V2 = ael>/ay, which gives (4).
That el> is harmonic follows at once by substituting (4) into (11), which gives the first
Laplace equation in (5).
766
CHAP. 18
Complex Analysis and Potential Theory
We finally take a harmonic conjugate 'l! of <D. Then the other equation in (5) holds.
Also, since the second partial derivatives of ¢ and 'l! are continuous, we see that the
complex function
F(:;)
=
¢(x. y)
+ i'l!(x. y)
is analytic in D. Since the curves 'l!(x, y) = const are perpendicular to the equipotential
curves ¢(x, y) = const (except where F' (z) = 0), we conclude that 'l!(x, y) = const are
the streamlines. Hence 'l! is the stream function and F(::) is the complex potential of the
flow. This completes the proof of Theorem I as well as our discussion of the important
•
role of complex analysi<; in compressible fluid flow.
L
.....
_-.
_
. . . . -ill . .
..._ .......
lA
-:
FLOW PATTERNS: STREAMLINES,
COMPLEX POTENTIAL
7. What F(z) would be suitable in Example I if the angle
of the comer were 7f/3?
problems should encourage you to experiment with
various functions FC:), many of which model interesting
flow patterns.
8. Sketch or graph the streamlines and equipotential lines
of F(:;.) = ;:3. Find V. Find all points at which V l~
parallel to the x-axis.
11-151
The~e
1. (Parallel flow J Show that F(~) = -; K~ (K positive
real) describes a uniform flow upward. which can be
interpreted as a uniform flow between two parallel lines
(parallel planes in three-dimensional space). See
Fig. 414. Find the velocity vector, the streamlines. and
the equipotential lines.
9. Find and graph the streamlines of F(:)
=
:2
+ 2:.
Interpret the flow.
10. Show that F(:) = i: 2 models a flow around a comer.
Sketch the streamlines and equipotential lines. Find V.
=
11. (Potential F(z)
lIz) Show that the streamlines of
F(~) = l/~ are circles through the origin.
12. (Cylinder) What happens in Example 2 if you replace
::: by Z2? Sketch and interpret the resulting flow in the
first quadrant.
13. Change F(~) in Example 2 slightly to obtain a flow
around a cylinder of radius 1"0 that gives the flow in
Example 2 if 1"0 ~ I.
14. (Aperture) Show that F(:) = arccosh : gives confocal
hyperbolas as streamlines, with foci at ::: = ::':: I. and the
flow may be interpreted as a flow through an aperture
(Fig. 415).
x
Fig. 414.
Parallel flow in Problem 1
2. (Conformal mapping) Obtain the now in Example I
from that in Prob. 1 by a suitable conformal mapping.
15. (Elliptical cylinder) Show that F(~) = arccos: gives
confocal ellipses as streamlines. with foci at : = ::':: 1.
and that the now circulates around an elliptic cylinder
or a plate (the segment from -1 to I in Fig. 416).
3. Find the complex potential of a uniform flow parallel
to the x-axis in the positive x-direction.
4. What happens to the flow in Prob. I if you replace
by ~e-ia with constant 0'. e.g., 0' = 7f/4?
z
5. What is the complex potential of an upward parallel
flow in the direction of y = 2x?
6. (Extension of Example l) Sketch or graph the flow in
Example I on the whole upper half-plane. Show that
you can interpret it as as flow against a horizontal wall
(the x-axis).
Fig. 415.
Flow through an aperture in Problem 14
SEC. 18.4
767
Fluid Flow
(d) Source and sink combined. Find the complex
potentials of a now with a source of strength 1 at
::: = - 0 and of a flow with a sink of strength I at
::: = 1I. Add both and sketch or graph the streamlines.
Show that for small 101 these lines look similar to those
in Prob. II.
Fig. 416.
Flow around a plate in Problem 15
(e) Flow with circulation around a cylinder. Add the
potential in (b) to that in Example 2. Show that this gives
a flow for which the cylinder wall 1::.:1 = I is a streamline.
Find the speed and show that the stagnation points are
iK
47T
x
Fig. 417.
Point source
Fig. 418.
Vortex flow
16. TEAM PROJECT. Role of the Natural Logarithm
in Modeling Flows. (a) Basic flo\\s: Source and sink.
Show that F(:;;) = (cl27T) In:;; with constant positive
real c gives a flow directed radially outward (Fig. 417),
so that F models a point source at ~. = 0 (that is, a
source line x = 0, y = 0 in space) at which t1uid is
produced. c is called the strength or discharge of the
source. If c is negati ve real. show that the flo,,' is
directed radially inward. so that F models a sink at
:;; = 0, a point at which fluid disappears. Note that
:;; = 0 is the singular point of F(:;;).
(b) Basic flows: Vortex. Show that n:;;) = -(Ki/27T)
In::: with positive real K gives a t10w circulating
counterclockwise around::: = 0 (Fig. 418), :;; = 0 is
called a vortex. Note that each time we travel around
the vonex. the potential increases by K.
(c) Addition of flOl\s. Show that addition of the
velocity vectors of two flows gives a flow whose
complex potential is obtained by adding the complex
potentials of those flows.
if K = 0 they are at ± 1; as K increases they move up
on the unit circle until they unite at :;; = i (K = 47T, see
Fig. 4(9), and if K > 47T they lie on the imaginary axis
(one lies in the field of flow and the other one lies inside
the cylinder and has no physical meaning).
Fig. 419. Flow around a cylinder without
circulation (K = 0) and with circulation
768
18.5
CHAP. 18
Complex Analysis and Potential Theory
Poisson's Integral Formula for Potentials
So far in this chapter we have seen that complex analysis offers powerful methods for
modeling and solving two-dimensional potential problems based on conformal mappings
and complex potentials. A further method results from complex integration. As a most
important result it yields Poisson's integral formula (5) for potentials in a standard domain
(a circular disk) and from (5) a useful series (7) for these potentials. Hence we can solve
problems for disks and then map solutions conforma\ly onto other domains.
Poisson's formula will follow from Cauchy's integral formula (Sec. 14.3)
1
F(::J = - .
(I)
27Tt
f -F(z*)
C
z* - ;:
d~*.
Here C is the circle z* = Reia lcounterclockwise. 0 ~ a ~ 27T), and we assume that F(z*)
is analytic in a domain containing C and its full interior. Since dz* = iReia da = iz* da,
we obtain from (l)
I
2~
F(z) = - ] F(z*)
(2)
27T
0
*
_z_
z* - Z
da
Now comes a little trick. If instead of z inside C we take a Z outside C. the integrals (1)
and (2) are zero by Cauchy's integral theorem (Sec. 14.2). We choose Z = z* E* /~ = R2/~,
which ic; outside C because Izi = R2/lzl = R2/r > R. From (2) we thus have
o= -1
27T
]2,,-F(z*)
0
-*
da
<.
Z* - Z
I
]2~
27T
0
= -
z*
F(z*) - - - - da
z*E*
z* - - Z
and by straightforward simplification of the last expression on the right,
o=
1
2~
-
- ] F(::*)
27T
0
~
Z - z*
da.
We subtract this from (2) and use the following formula that you can verify by direct
calculation (zz* cancels):
(3)
z*
z
z* - Z
.;: - z*
.:*z* - z.;:
(z* - z)(z* - f)
We then have
(4)
F(.::)
1 ]2~
= -2
7T
0
F(z*)
.::*2"* - z.;:
(z* -
z)(~*
-
~)
da.
From the polar representations of z and z* we see that the quotient in the integrand is real
and equal to
R2 - 2Rr cos
«() -
a)
+
r2
SEC. 18.5
Poisson's Integral Formula for Potentials
769
We now write F(z) = cI>(r, 8) + i'l'(r, 8) and take the real part on both sides of (4).
Then we obtain Poisson's integral formula 2
I
(5)
cI>(,., 8) = 27T
R2 - r2
cI>(R, a) R2 _ 2Rr cos (8 _ a)
L
27T
+
,.2 da.
This formula represents the harmonic function cI> in the disk Izl ~ R in terms of its values
cI>(R, a) on the boundary (the circle) Izl = R.
Formula (5) is still valid ifthe boundary function cI>(R. a) is merely piecewise continuous
(as is practically often the case: see Fig. 401 in Sec. 18.2 for an example). Then (5) gives
a function harmonic in the open disk. and on the circle Izl = R equal to the given boundary
function. except at points where the latter is discontinuous. A proof can be found in
Ref. [D1] in App. 1.
Series for Potentials in Disks
From (5) we may obtain an important series development of cI> in terms of simple harmonic
functions. We remember that the quotient in the integrand of (5) was derived from (3).
We claim that the right side of (3) is the real pat10f
z*
+
(z*
Z
+ z)(':*
z*z* -
- z)
zz -
z*z +
~z*
(z* - z)(z* - Z)
z* - z
Indeed, the last denominator is real and so is z*z*
zz m the numerator, whereas
-z*z + zz* = 2i 1m (zz*) in the numerator is pure imaginary. This verifies our claim.
Now by the use of the geometric series we obtain (develop the denominator)
(6)
Since z
1 + (z/z*)
1 - (z/z*)
z* + z
-* <.
=
~
,.
rei(J
=
and z*
Re [(
ReiD<.
z~ TJ = Re [ ; :
On the right, cos (118 - na)
,.* + Re ----,.
,.-*
we have
- -
(.
1
einfie-mD<
=
cos n8 cos na
+
2
~1 Re (z~
+
2
~1
(6*)
(;
r
+
J (;
r
cos (n8 - na).
sin n8 sin na. Hence from (6) we obtain
r
(cos n8 cos lla
+ sin n8 sin na).
2SLMEON DENIS POISSON (1781-1840), French mathematician and physicist, professor In Paris from 1809.
His work includes potential theory, partial differential equations (Pois~on equation, Sec. 12.1), and probability
(Sec. 24.7).
770
CHAP. 18
Complex Analysis and Potential Theory
This expression is equal to the quotient in (5), as we have mentioned before, and by
inserting it into (5) and integrating term by term with respect to a from 0 to 27T we obtain
(7)
cI>(r, 8)
=
+ ~1
ao
(~
r
(lln
cos nfl
+ bn
sin nfl)
where the coefficients are [the 2 in (6*) cancels the 2 in 1/(27T) in (5)]
ao
=
27T
f
1
2.".
an
cI>(R, a) da,
=
0
7T
(8)
1
f
= -
bn
7T
f
27T
cI>(R, a) cos lIa da,
0
11
27T
=
1,2.....
cI>(R. a) sin na da.
0
the Fourier coefficients of cI>(R, a); see Sec. 11.1. Now for r = R the series (7) becomes
the Fourier series of cI>(R, a). Hence the representation (7) will be valid whenever the
given cI>(R. a) on the boundary can be represented by a Fourier series.
E X AMP L E 1
Dirichlet Problem for the Unit Disk
Find the electrostatic potential <P(r, 0) in the unit disk r < I having the boundary values
<P(l,
Solution.
{
-Cd'7T
if
al7T
if
-7T
< a < 0
(Fig. 420).
0
< a <
=
=
[-
2 2
-4/(1l 7T )
I~.". -; cos da + L""-; cos da]
I
2
-
-
=
l1a
(cos 117T
112:2
-
1).
= () if 11 = 2.4, ... , and the potential is
if 11 is odd. an
<P(r 0) =
,
/la
To
=! and
Since <PO, a) is even. bn = O. and from (8) we obtain ao
an ~
Hence. an
a) =
4
[
7T 2
5
,.
cos 0
+ -r32
3
cos 38
+ -r
52
cos 58
+ ... ]
Figure 421 shows the unit disk and some of the equipotential lines (curves <P = const).
•
x
-71
Fig. 420.
o
71
a
Boundary values in Example 1
Fig. 421.
Potential in Example 1
_
SEC. 18.6
771
General Properties of Harmonic Functions
.....
---.-!7T
1. Verify (3).
14. TEAM PROJECT. Potential in a Disk. (a) Mean
3. Give the details of the derivation of the series (7) from
value property. Show that the value of a harmonic
function <P at the center of a circle C equals the mean
of the value of <P on C (see Sec. 18.4, footnote I, for
definitions of mean values).
the Poisson formula (5).
14-131
HARMONIC FUNCTIONS IN A DISK
Using (7), find the potential <P(r, 8) in the unit disk r < I
having the given boundary values <P(l, 8). Using the sum
of the fIrst few terms of the series. compute some values
of <P and sketch a figure of the equipotential lines.
4. <P(l, 8)
=
(b) Separation of variables. Show that the terms of
(7) appear as solutions in separating (he Laplace
equation in polar coordinates.
sin 28
(c) Harmonic conjugate. Find a series for a harmonic
conjugate 'It of <P from (7).
S. <P(1, e) = 2 sin 2 fJ
6. <P(l, 8)
=
cos 2 58
7. <P(\, 8) = 8 if
(d) Power series. Find a series for F(::.) = <P
-7T
< 8<
8. <P(I, 8)
=
8 if 0 < 8 <
9. <P(I. 8)
=
sin 3 28
10. <P(l, 8)
11. <P(I, 8)
= cos 4
12. <P(L 8)
<P(l, 8)
=
18.6
=
=
8 if
-7T
< 8<
-!7T
+ i'lt.
15. CAS EXPERIMENT. Series (7). Write a program for
series developments (7). Experiment on accuracy by
computing values from partial sums and comparing
them with values that you obtain from your CAS graph.
Do (his (a) for Example I and Fig. 421, (b) for <P in
Prob. 8 (which is discontinuous on the boundary!),
(c) for a <P of your choice with continuous boundary
values. (d) for <P with discontinuous boundary values.
7T
27T
8
2
!7T,
13. <P(L 8) = 8 if
< 8<
<P(1. 8) = 7T - 8 it!7T < 8 < ~7T
2. Show that every term in (7) is a harmonic function in
the disk r < R.
7T
!7T.
1 if
< 8<
0 it!7T < 8 < ~7T
General Properties of Harmonic Functions
General properties of hamlOnic functions can often be obtained from properties of analytic
functions in a simple fashion. Specifically, important mean value properties of harmonic
functions follow readily from those of analytic functions. The details are as follows.
THEOREM 1
Mean Value Property of Analytic Functions
Let fez) be analytic in a simply connected domain D. Then the value of F(z) at a
point Zo in D is equal to the mean value of Hz) on any circle in D with center at zoo
PROOF
In Cauchy's integral formula (Sec. 14.3)
(1)
I
F(Zo) = - - .
27Tt
we choose for C the circle
(1) becomes
z=
(2)
F(zo)
::'0
+
F(z.)
r1 ----=-=dz
Z
C
";0
re ia in D. Then::. -
1
= -2
7T
f
0
z.o =
27T
F(::,o
+ re ia ) dO'.
reia, dz = ire ia dO', and
772
CHAP. 18
Complex Analysis and Potential Theory
The right side is the mean value of F on the circle (= value of the integral divided by the
length 27T of the interval of integration). This proves the theorem.
•
For harmonic functions, Theorem 1 implies
THEOREM 2
Two Mean Value Properties of Harmonic Functions
Let <P(x, y) be harmonic in a simply connected d()main D. Then the value of
<P(x, y) at a point (xo, Yo) in D is equal to the mean value of <P(x, Y) Oil any circle
ill D with center at (xo, Yo). This value is also equal to the mean value of <P(x, y)
0/1 any circular disk in D with center (xo, Yo). [See footnote I in Sec. 18.4.]
PROOF
The first part of the theorem follows from (2) by taking the real parts on both sides,
I
<P(xo, Yo) = Re F(xo
+
(\'0)
= -2
7T
i
27T
<P(xo
+
r cos
0',
Yo
+
r sin
0')
dO'.
0
The second part of the theorem follows by integrating this formula over r from 0 to ro
(the radius of the disk) and dividing by r0212.
1
(3)
<P(xo, Yo) =
--2
7Tro
i Lr7T
"0
0 0
<P(xo
+ ,. cos 0', Yo +
r sin 0')" dO' dr.
The right side is the indicated mean value (integral divided by the area of the region of
integration).
•
Returning to analytic functions, we state and prove another famous consequence of
Cauchy's integral formula. The proof is indirect and shows quite a nice idea of applying
the ML-inequality. (A bounded regio/1 is a region that lies entirely in some circle about
the origin.)
THEOREM 3
Maximum Modulus Theorem for Analytic Functions
Let F(z) be analytic and nonCOllstant i/1 a dOlllain containing (/ bounded region R
and its boundary. Then the absolute value IF(z)1 cannot have a maximum at a/1
il1terior point of R. Consequently, the maximulIl of IF(z)1 is taken on the boundnry
of R. If F(z) -=1= 0 in R, the same is true with respect to the minimum l!lIF(z)l.
PROOF
We assume that IF(z)1 has a maximum at an interior point:::o of R and show that this leads
to a contradiction. Let IF(zo)1 = M be this maximum. Since F(z) is not constant, IF(:::)I is
not constant, as follows from Example 3 in Sec. 13.4. Consequently. we can find a circle
C of radius r with center at zo such that the interior of C is in R and IF(z)1 is smaller than
M at some point P of C. Since iF(z) I is continuous, it will be smaller than M on an arc
C 1 of C that contains P (see Fig. 422), say,
IF(z) I ~ M - k
(k > 0)
for all
z on C1 .
SEC. 18.6
General Properties of Harmonic Functions
773
Fig. 422.
Proof of Theorem 3
Let C 1 have the length L 1 . Then the complementary arc C2 of C has the length 27TT - L 1 .
We now apply the ML-inequality (Sec. 14.1) to (I) and note that Iz - zol = r. We then
obtain (using straightforward calculation in the second line of the formula)
M = IF(zo) I ;;::
-1
27T
IJ -=-=-F(::.) dz I + - 1 IJ -_F(z) d::. I
C1
"
Zo
27T
C2 Z
Zo
that is, M < M, which is impossible. Hence our a<;sllmption is false and the first statement
is proved.
Next we prove the second statement. If F(z) *- 0 in R, then lIF(z) is analytic in R.
From the statement already proved it follows that the maximum of 11 IF(z) I lies on the
boundary of R. But this maximum corresponds to the minimum of IF(z)l. This completes
the proof.
•
This theorem has several fundamental consequences for harmonic functions, as follows.
THEOREM 4
Harmonic Functions
Let <1>(x, y) be lza17110nic in a domain containing a simply connected bounded region
R and its bOllndary curve C. Then:
(I) (Maximum principle) If <1>(x, y) is not COli stant, it has neither a maximum
nor a minimulIl in R. Consequently, the I1ULyimU111 and the minimulll are taken on
the boundary of R.
(II)
~f <1>(.1', y)
is constant on C, thell <1>(.1", y) is a constant.
(III) If h(x, y) is harmonic in R alld on C and
h(x, y) = <1>(x, y) everywhere in R.
PROOF
!f h(x, y) =
<1>(x, y) on C, then
(I) Let 'IJf(x, y) be a conjugate harmonic function of <1>(x, y) in R. Then the complex
function F(::;) = <1>(x, y) + i'lJf(x, y) is analytic in R, and so is G(z) = eF(Z). Its absolute
value is
IG(;:) I = eRe F(z)
=
e'Nx. y).
From Theorem 3 it follows that IG(z)1 cannot have a maximum at an interior point of R.
Since e<P is a monotone increasing function of the real variable <1>, the statement about the
CHAP. 18
774
Complex Analysis and Potential Theory
maximum of <D follows. From this, the statement about the minimum follows by replacing
<1> by -<D.
(II) By (I) the function <D(.\, y) takes its maximum and its minimum on C. Thus, if
y) is constant on C, its minimum must equal its maximum, so that <D(.\", y) must be
a constant.
<D(.\",
(III) If II and <D are harmonic in R and on C. then h - <D is also harmonic in Rand
on C, and by assumption, II - <D = 0 everywhere on C. By (II) we thus have /J - <D = 0
everywhere in R. and (Ill) is proved.
•
The last statement of Theorem 4 is very important. It means that a IW17110nic jilllction is
uniquely determi1led in R by its values Oil the boundary (?f" R. Usually, <1>(.\, y) is required
to be harmonic in R and continuous on the boundary of R. that is,
lim <D(x, y) = <D(xo, Yo), where (xo, Yo) is on the boundary and (x, y) is in R.
X-Xo
y~yo
Under these assumptions the maximum principle (I) is still applicable. The problem of
determining <D(x, y) when the boundary values are given is called the Dirichlet problem
for the Laplace equation in two variables, as we know. From (Ill) we thus have, as a
highlight of our discussion.
THEOREM 5
Uniqueness Theorem for the Dirichlet Problem
Iflor a given region and given boulldary mlues the Dirichlet problem for the Laplace
equatio/l in two l'ariables has a solutioll. the solution is ullique.
---..
.... -~
...-.-----
kl 2 around the unit circle. Does your result
contradict Theorem I?
1. Integrate
12-" 1 VERIFY THEOREM 1 for the given F(::),
::0'
and
circle of radius l.
2. (::: + 1)3':::0 = 2
3. (::: - 2)2. ::0 = ~
4. 10:::
15-71
4
'':0
=
0
VERIFY THEOREM 2 for the given cI>(x. y).
(xo. Yo) and Circle of radius I.
5. (x - 2)(y - 2), (4. -4)
6. x 2
-
.'"2. (3, 8)
7. x 3
-
3xy2, (I, I)
8. Derive Theorem 2 from Poisson's integral formula.
9. CAS EXPERIMENT. Graphing Potentials. Graph
the potentials in Probs. 5 and 7 and for three other
functions of your choice as smfaces over a rectangle
or a disk in the xy-plane. Find the locations of maxima
and minima by inspecting these graphs.
10. TEAM PROJECT. Maximum Modulus of Analytic
Functions. (a) Verify Theorem 3 for (i) F(:::) = ::2 and
the square 4 ~ x ~ 6. 2 ~ Y ~ 4, (ii) F(:::) = e1z and
any bounded domain. (iii) F(:::) = sin:: and the unit
disk.
(b) F(x) = cos x (x real) has a maximum I at O.
How does it follow that this cannot be a maximum of
IFez.) I = Icos:::1 in a domain containing Z = O?
(c) F(::) = 1 + 1::12 is not Lero in the disk Izi ~ 4 and
has a minimum at an interior point. Does this contradict
Theorem 3?
(d) If F(:::) is analytic and not constant in the closed
unit disk D: 1<:1 ~ 1 and IF(.;;)I = C = COIlSt on the unit
circle. show that F(.:) must have a zero in D. Can you
extend this to an arbitrary simple closed curve?
775
Chapter 18 Review Questions and Problems
Ill-13
I
MAXIMUM MODULUS
Find the location and si7e of the maximum of IF(;:)I in the
unit disk 1:<:1 ;::; I.
11. F(;:) = ;:2 -
12. F(;:)
=
a;:
1
+ b la,
b complex)
13. F(;:) = cos 2;:
14. Verify the maximum principle for <1>(x. y) = eX cos y
and the rectangle a ;::; x ;::; b. 0 ;::; y ;::; 2'iT.
. .
=
.-.... •.-. ... : .-======
--
.~
15. (Conjugate) Do ¢ and a harmonic conjugate \)! of (I) in
a region R have their maximum at the same point of R?
16. (Conformal mapping) Find the location (u l . VI) of the
maximum of <1>* = ell cos V in R*: Iwl ;::; 1. V ~ 0,
where w = 1I + iv. Find the region R that is mapped
onto R* by w = f(;:) = ;:2. Find the potential in R
resulting from <1>* and the location (xl' .\'1) of the
maximum. Is (lib VI) the image of (Xl' YI)? If so, is
this ju~t by chance?
•• , -:..ll£STIONS AND PROBLEMS
1. Why can potential problems be modeled and solved by
complex analysis? For what dimensions?
2. What is a harmonic function? A harmonic conjugate?
3. Give a few example~ of potential problems considered
in this chapter.
4. What is a complex potential? What does it give
physically'?
5. How can conformal mapping be used in connection with
the Dirichlet problem?
6. What heat problems reduce to potential problems? Give
a few examples.
7. Write a short essay on potential theory in fluid flow
from memory.
8. What is a mixed boundary value problem? Where did
it occur?
9. State Poisson's formula and its derivation from
Cauchy's formula.
10. State the maximum modulus theorem and mean value
theorems for harmonic functions.
11. Find the potential and complex potential between the
plates y = x and y = x + 10 kept at 10 V and 110 V,
respecti vel y.
12. Find the potential between the cylinders 1:1 = I cm
having potential 0 and Id = 10 em having potential 20
kV.
13. Find the complex potential in Prob. 12.
14. Find the equipotential line U = 0 V between the
cylinders Id = 0.2S cm and 1:1 = 4 cm kept at -220 V
and 220 V. respectively. (Guess first.)
15. Find the potential between the cylinders Izl = 10 cm
and 1:::1 = 100 cm kept at the potentials 10 kV and 0,
respectively.
16. Find the potential in the angular region between the
plates Arg ::: = 'iT16, kept at 8 k V, and Arg ;: = 'iT13, kept
dt 6kV.
17. Find the equipotential lines of F(;:)
= i
Ln z..
18. Find and sketch the equipotential lines of
F(:::) = (l
+
i)/:::.
19. What is the complex potential in the upper half-plane
if the negative half of the x-axis has potential I kV and
the positive half is grounded?
20. Find the potential on the ray)' = x, x > 0, and on
the positive half of the x-axis if the positive half of
the .v-axis is at 1200 V and the negative half IS
grounded.
21. Interpret Prob. 20 as a problem in heat conduction.
22. Find the temperature in the upper half-plane if the
portion x > 2 of the x-axis is kept at SO"C and the other
portion at O°C.
23. Show that the isotherm~ of Fr;:) =
hyperbolas.
_;:::2
+ ;: are
24. If the region between two concentric cylinders of radii
2 cm and 10 cm contains water and the outer cylinder
is kept at 20°C, to what temperature must we heat the
inner cylinder in order [0 have 30°C at distance 5 cm
from the axis?
25. What are the streamlines of Fr:::) = i/;:?
26. What is the complex potential of a flow around a
cylinder of radius 4 without circulation?
27. Find the complex potential of a source at ::: = 5. What
are the streamlines?
28. Find the temperature in the unit disk 1:1 ;::; I in the form
of an infinite series if the left semicircle of Izi = 1 has
the temperature of SO°C and the right semicircle has the
temperature DoC.
29. Same task as in Prob. 2~ if the upper semicircle is at
40°C and the lower at DoC.
30. Find a series for the potential in the unit disk with
boundary values <1>( I, 0) = 0 2 (-'iT < 0 < 'iT).
776
CHAP. 18
Complex Analysis and Potential Theory
......
•
~
,,~--
Complex Analysis and Potential Theory
Potential theory is the theory of solutions of Laplace's equation
(1)
Solutions who~e second pattial derivatives are continuous are called harmonic
functions. Equation (I) is the most important PDE in physics, where it is of interest
in two and three dimensions. It appears in electrostatics (Sec. 18.1), steady-state heat
problems (Sec. 18.3), fluid flow (Sec. 18.4), gravity, etc. Whereas the three-dimensional
case requires other methods (see Chap. 12), two-dimensional potential theory can
be handled by complex analysis. since the real and imaginary parts of an analytic
function are harmonic (Sec. 13.4). They remain harmonic under conformal mapping
(Sec. 18.2), so that conformal mapping becomes a powerful tool in solving
boundary value problems for (1), as is illustrated in this chapter. With a real potential
<P in (1) we can associate a complex potential
(2)
F(z) = <P
+
i\If
(Sec. 18.1).
Then both families of curves <P = const and \If = COllst have a physical meaning.
In electrostatics, they are equipotential lines and lines of electrical force (Sec. 18.1).
In heat problems. they are isotherms (curves of constant temperature) and lines of
heat flow (Sec. 18.3). In fluid flow, they are equipotential lines of the velocity
potential and stream lines (Sec. 18.4).
For the disk. the solution of the Dirichlet problem is given by the Poisson formula
(Sec. 18.5) or by a series that on the boundary circle becomes the Fourier series of
the given boundary values (Sec. 18.5).
Hatmonic functions. like analytic functions, have a number of general properties:
particularly important are the mean value property and the maximum modulus
property (Sec. 18.6), which implies the uniqueness of the solution of the Dirichlet
problem (Theorem 5 in Sec. 18.6).
,
PA RT
••
E
•
•1
,
t
-
.(
.,
..""
~
(
Numeric
Analysis
Software (p. 778-779)
CHAPTER 19
Numerics in General
CHAPTER 20
Numeric Linear Algebra
CHAPTER 21
Numerics for ODEs and PDEs
Numeric analysis, more briefly also called numerics, concerns numeric methods. that
is, methods for solving problems in terms of numbers or corresponding graphical
representations. It also includes the investigation of the range of applicability and of the
accuracy and stability of these methods.
Typical tasks for numerics are the evaluation of definite integrals. the solution of equations
and linear systems, the solution of differential or integral equations for which there are
no solution formulas, and the evaluation of experimental data for which we want to obtain.
for example. an approximating polynomial.
Numeric methods then provide the transition from the mathematical model to an
algorithm, which is a detailed stepwise recipe for solving a problem of the indicated kind
to be programmed on your computer, using your CAS (computer algebra system) or other
software, or on your programmable calculator.
In this and the next two chapters we explain and illustrate the most frequently used basic
numeric methods in algorithmic form. Chapter 19 concerns numerics in general; Chap. 20
numeric linear algebra. in particular, methods for linear systems and matrix eigenvalue
problems; and Chap. 21 numerics for ODEs and PDEs.
The algorithms are given in a form that seems best for showing how a method works. We
suggest that you also make use of programs from public-domain or commercial software
listed on pp. 778-779 or obtainable on the lnternet.
778
PART E
Numeric Analysis
Numerics has increased in importance to the engineer more than any other field of
mathematics owing to the ongoing development of powerful software resulting from great
research activity in numerics: new methods are invented, existing methods are improved
and adapted. and old methods-impractical in precomputer times-are rediscovered. A
main goal in these activities is the development of well-stmctured software. And in largescale work-millions of equations or steps of iteration--even small algorithmic
improvements may have a large effect on computing time, storage demand, accuracy, and
stability.
On average this makes the algorithms used in practice more and more complicated.
However, the more sophisticated modem software will become, the more important it
will be to understand concepts and algorithms in a basic form that shows original
motivating ideas of recent developments.
To avoid misunderstandings: Various simple classical methods are still very useful in
many routine situations and produce satisfactory results. In other words, not everything
has become more sophisticated.
Software
See also http://www.wiley.com/college/kreyszig/
The following list will help you if you wish to find software. You may also obtain
information on known and new software from magazines, such as Byte Magazine or PC
Maga::,ille, from articles published by the Americall Mathematical Society (see also their
website at www.ams.org).theSocietyforlndustrialandAppliedMathematics(SIAM.at
www.siam.org). the Association for COl1lpllting Machillel:r (ACM. at www.acm.org), or
the Institute of Electrical and Electronics Engineers (IEEE, at www.ieee.org). Consult
also your library, Computer Science Department, or Mathematics Department.
Derive. Texas Instntments, Inc .. Dallas, TX. Phone 1-800-842-2737 or (972) 917-8324,
website at www.derive.com or www.education.ti.com.
EISPACK. See LAPACK.
GAMS (Guide to Available Mathematical Software). Website at http://gams.nist.gov.
On-line cross-index of software development by NIST. with links to IMSL. NAG, and
NETUB.
IMSL (International Mathematical and Statistical Library). Visual Numerics, Inc.,
Housron. TX. Phone \-800-222-4675 or (713) 784-3131. website at www.vni.com.
Mathematical and statistical Fortran routines with graphics.
LAPACK. Fortran 77 routines for linear algebra. This software package supersedes
UNPACK and EISPACK. You can download the routines
(see http://cm.bell-labs.comlnetliblbib/minors.html) or order them directly from NAG.
The LAPACK User's Guide is available at www.netlib.org.
PART E
Numeric Analysis
779
LINPACK see LAPACK
Maple. Waterloo Maple, Inc., Waterloo, ON, Canada. Phone 1-800-267-6583 or
(519) 747-2373, website at www.maplesoft.com.
Maple Computer Guide. For Advanced Engineering Mathematics, 9th edition. By
E. Kreyszig and E. J. Norminton. J. Wiley and Sons. [nc .. Hoboken. NJ. Phone
1-800-225-5945 or (201) 748-6000.
Mathcad. MathSoft, Inc., Cambridge, MA. Phone 1-800-628-4223 or (617) 444-8000.
website at www.mathcad.com or www.mathsofLcom.
Mathematica. Wolfram Research, Inc., Champaign. IL. Phone 1-800-965-3726 or
(217) 3lJ8-0700, website at www.wolframresearch.com.
Mathematica Computer Guide. For Advanced Engineering Mathematics. lJth
edition. By E. Kreyszig and E. J. Norminton. J. Wiley and Sons. Inc .. Hoboken. NJ. Phone
1-800-225-5945 or (201) 748-6000.
Matlab. The MathWorks, Inc., Natick, MA. Phone (508) 647-7000, website at
www.mathworks.com.
NAG. Numerical Algorithms Group, Inc., Downders Grove. IL. Phone (630) 971-2337,
website at www.nag.com. Numeric routines in Fortran 77, Fortran 90, and C.
NETLIB. Extensive library of public-domain software. See at www.netIib.org and
http://cm.bell-Iabs.comlnetlib/.
NIST. National Institute of Standards and Technology, Gaithersburg, MD. Phone
(301) 975-2000, website at www.nist.gov. For Mathematical and Computational Science
Division phone (301) 975-3800. See also http://math.nist.gov.
Numerical Recipes. Cambridge University Press, New York. NY. Phone (212) 924-3900.
website at www.us.cambridge.org. Books (also source codes on CD ROM and
discettes) containing numeric routines in C, C++. Fortran 77, and F0l1ran 90. To order.
call office at West Nyack. NY, at 1-800-872-7423 or (845) 353-7500 or online at
www.numerical-recipes.com.
FURTHER SOFTWARE IN STATISTICS. See Part G.
~
I
:·~f
~.._I
,
....
~
"1
. ... . /, -. "«:-
CHAPTER
19
i
....J
Numerics in General
This first chapter on numerics begins with an explanation of some general concepts. such
as floating point, roundoff errors, and general numeric errors and their propagation. In
Sec. 19.2 we discuss methods for solving equations. Interpolation methods, including
splines, follow in Secs. 19.3 and 19.4. The last section (19.5) concerns numeric integration
and differentiation.
The purpose of this chapter is twofold. First, for all these tasks the student should
become familiar with the most basic (but not too complicated) numeric solution methods.
These are indispensable for the engineer, because for many problems there is no solution
formula (think of a complicated integral or a polynomial of high degree or the interpolation
of values obtained by measurements). In other cases a complicated solution formula may
exist but may be practically useless.
Second. the student should learn to understand some ba<;ic ideas and concepts that are
important throughout numerics, such as the practical form of algorithms. the estimation
of errors, and the order of convergence.
Prerequisite: Elementary calculus
References and Answers to Problems: App. J Part E, App. 2
19.1
Introduction
Numeric methods are used to solve problems on computers or calculators by numeric
calculations, resulting in a table of numbers andlor graphical representations (figures). The
steps from a given situation (in engineering, economics. etc.) to the final answer are usually
as follows.
1. Modeling. We set up a mathematical model of our problem. such as an integral, a
system of equations. or a differential equation.
2. Choosing a numeric method and parameters (e.g., step size), perhaps with a
preliminary error estimation.
3. Programming. We u~e the algorithm to write a corresponding program in a CAS,
such as Maple, Mathematica, Matlab, or Mathcad, or, say, in Fortran, C, or C++,
selecting suitable routines from a software system as needed.
4. Doing the computation.
S. Interpreting the results in physical or other terms, also deciding to rerun if further
results are needed.
780
SEC. 19.1
Introduction
781
Steps and 2 are related. A slight change of the model may often admit of a more
efficient method. To choose methods, we must first get to know them. Chapters 19- 21
contain efficient algorithms for the most important classes of problems OcculTing
frequently in practice.
In Step 3 the program consists of the given data and a sequence of instructions to be
executed by the computer in a certain order for producing the answer in numeric or graphic
form.
To create a good understanding of the nature of numeric work, we continue in this
section with some simple general remarks.
Floating-Point Form of Numbers
We know that in decimal notation, every real number is represented by a finite or an
infinite sequence of decimal digits. Now most computers have two ways of representing
numbers, called fixed poillf and floating point. In a fixed-point system all numbers are
given with a fixed number of decimals after the decimal point; for example, numbers
given with 3 decimals are 62.358, 0.014, 1.000. In a text we would write, say, 3 decimals
as 3D. Fixed-point representations are impractical in most scientific computations because
of their limited range (explain!) and will not concern us.
In a floating-point system we write. for instance.
0.1735' 10- 13 ,
-0.2000' 10- 1
1.735' 10- 14,
- 2.000' 10- 2 .
or sometimes also
We see that in this system the number of significant digits is kept fixed, whereas the
decimal point is "t1oating." Here, a significant digit of a number c is any given digit of
c, except possibly for zeros to the left of the first nonzero digit; these zeros serve only to
fix the position of the decimal point. (Thus any other zero is a significant digit of c.) For
instance, each of the numbers
1360,
1.360,
0.001360
has 4 significant digits. In a text we indicate, say, 4 significant digits, by 4S.
The use of exponents permits us to represent very large and very small numbers. Indeed.
theoretically any nonzero number a can be written as
0.1 ~
(I)
Iml <
1,
11
integer.
On the computer, 111 is limited to k digits (e.g., k = 8) and n is limited. giving representations
(for finitely many numbers only!)
(2)
a = :!:.m· 10",
These numbers a are often called k-digit decimal macbine numbers. Their fractional part
m (or Iii) is called the mamis.m. This has nothing to do with "mantissa" as used for
logarithms. 11 is called the exponent of a.
782
CHAP. 19
Numerics in General
Underflow and Overflow. The range of exponents that a typical computer can handle
is very large. The IEEE (Institute of Electrical and Electronics Engineers) floating-point
standard for single precision (the usual number of digits in calculations) is about
- 38 < 11 < 38 (about - 125 < n* < 125 for the exponent in binary representations,
i.e., representations in base 2). [For so-called double precision it is about - 308 < n < 308
(about -1020 < n* < 1020 for binary).] If in a computation a number outside that range
occurs, this is called underflow when the number is smaller and overflow when it is
larger. In the case of underflow the result is usually set to zero and computation continues.
Overflow causes the computer to halt. Standard codes (by IMSL, NAG, etc.) are written
to avoid overflow. Error messages on overflow may then indicate progran1ming errors
(incorrect input data, etc.).
Roundoff
An error is caused by chopping (= discarding all decimals from some decimal on) Or
rounding. This error is called roundoff error, regardless of whether we chop or round.
The rule for rounding off a number to k decimals is as follows. (The rule for rounding
off to k significant digits is the same, with "decimal" replaced by "significant digit.")
Roundoff Rule. Discard the (k + 1)th and all subsequent decimals. (a) If the number
thus discarded is less than half a unit in the Ath place, leave the kth decimal unchanged
(" rounding down"). (b) If it is greater than half a unit in the A1h place, add one to the kth
decimal ("rounding lip"). (c) If it is exactly half a unit, round off to the nearest el'en
decimal. (Example: Rounding off 3.45 and 3.55 to I decimal gives 3.4 and 3.6,
respectively.)
The last part of the rule is supposed to ensure that in discarding exactly half a decimal,
rounding up and rounding down happens about equally often, on the average.
If we round off 1.2535 to 3, 2, I decimals, we get 1.254. 1.25. 1.3, but if 1.25 is rounded
off to one decimal, without fmther information, we get 1.2.
Chopping is not recommended because the corresponding error can be larger than that
in rounding, and is systematic. (Nevertheless, some computers use it because it is simpler
and faster. On the other hand, some compurers and calculators improve accuracy of results
by doing intermediate calculations using one or more extra digits, called guarding digits.)
Error in Rounding. Let a = fl(a) in (2) be the floating-point computer approximation
of a in (I) obtained by rounding. where fl suggests floating. Then the roundoff rule gives
(by dropping exponents) 1111 - ilil ~ ~. lO- k . Since Iml ~ 0.\, this implies (when a *- 0)
(3)
II - a
a
I
I
1111 - iii
11/
I
~
I
l-k
_.\0
2
The right side u = ~. 10 1 - k is called the rounding unit. If we write a = ll( I + 8), we
have by algebra (li - a)/a = 8, hence 181 ~ 1I by (3), This sholl"s that the rounding l/17it
u is all error bOl/nd in rounding.
Rounding errors may ruin a computation completely, even a small computation. In
general, these errors become the more dangerous the more arithmetic operations (perhaps
several millions!) we have to perform, It is therefore important to analyze computational
programs for expected rounding eITors and to find an arrangement of the computations
such that the effect of rounding errors is as small as possible.
The arithmetic in a computer is not exact either and causes further errors; however,
these will not be relevant to our discussion.
SEC. 19.1
783
Introduction
Accuracy in Tables. Although available software has rendered various tables of function
values superfluous, some tables (of higher functions, of coefficients of integration
formulas, etc.) will still remain in occasional use. [f a table shows k significant digits, it
is conventionally assumed that any value a in the table deviates from the exact value a
by at most ±! unit of the kth digit.
Algorithm. Stability
Numeric methods can be formulated as algorithms. An algorithm is a step-by-step
procedure that states a numeric method in a form (a "pseudocode") understandable to
humans. (Turn pages to see what algorithms look like.) The algorithm is then used to
write a program in a programming language that the computer can understand so that it
can execute the numeric method. Important algorithms follow in the next sections. For
routine tasks your CAS or some other software system may contain programs that you
can use or include as pat1s of larger programs of your own.
Stability. To be useful, an algorithm should be stable; that is, small changes in the
initial data should cause only small changes in the final results. However, if small changes
in the initial data can produce large changes in the final results, we call the algorithm
unstable.
This "numeric instability," which in most cases can be avoided by choosing a better
algorithm, must be distinguished from "mathematical instability" of a problem, which is
called "ill-conditiolling, " a concept we discuss in the next section.
Some algorithms are stable only for certain initial data, so that one must be careful in
such a case.
Errors of Numeric Results
Final results of computations of unknown quantities generally are approximations; that
is, they are not exact but involve errors. Such an error may result from a combination of
the following effects. Roundoff errors result from rounding, as discussed on p. 782.
Experimental errors are en-ors of given data (probably arising from measurements).
Truncating errors result from truncating (prematurely breaking off), for instance, if we
replace a Taylor series with the sum of its first few terms. These errors depend on the
computational method used and must be dealt with individually for each method.
["Truncating" is sometimes used as a term for chopping off (see before), a terminology
that is not recommended.]
Formulas for Errors. If a is at1 approximate value of a quantity whose exact value is
a, we call the difference
(4)
E=a-a
the error of a. Hence
(4*)
a
=
a + E,
True value
= Approximation + Error.
For instance, if a = 10.5 is an approximation of a = 10.], its error is
error of an approximation a = 1.60 of a = 1.82 is E = 0.22.
E
= -0.3. The
CHAP. 19
784
Numerics in General
CAUTION! In the literature la - al ("ab~olute error") or
used as definitions of error.
The relative error E,. of is defined by
a-
a are sometimes also
a
(5)
Er
=
E
a-a
Error
a
a
True value
(a =f:. 0).
This looks useless because a is unknown. But if Itl is much less than
a instead of a and get
lal. then we can use
(5')
This still looks problematic hecause E is unknown-if it were known, we could get
a = a + E from (4) and we would be done. But what one often can ohtain in practice is
an error bound for a, that is, a number f3 such that
Itl
~
f3,
la - al
hence
~
f3.
This tells us how far away from our computed a the unknown a can at most lie. Similarly.
for the relative error, an en-or bound is a number f3r such that
a - a
--a-
hence
I
I
~
f3r·
Error Propagation
This is an imp0l1ant matter. It refers to how en-ors at the beginning and in later steps
(roundoff, for example) propagate into the computation and affect accuracy, sometimes
very drastically. We state here what happens to en-or bounds. Namely, bounds for the
error add under addition and subtraction, whereas bounds for the relative error add under
multiplication and division. You do well to keep this in mind.
THEOREM 1
Error Propagation
(a) In addition and sllbtraction, an error bound for the results is given by the
sum of the error bounds for the terms.
(b) In multiplication and division, all error bound for the relative error of the
results is given (approximately) by the sum of the bounds for the relative errors
of the gil'en llumbers.
PROOF
(a) We use the notations x =
E of the difference we obtain
Itl
=
=
x + EI' Y = Y+ E2, IEII ~ f31.IE21 ~ f32' Then for the en-or
Ix Ix -
y - (x -
y)1
x - (y -
y)1
= h - E21
~
hi
+
k21
~ f31 + f32'
SEC. 19.1
785
Introduction
The proof for the sum is similar and is left to the student.
(b) For the relative error
and bounds f3rI, f31'2
IErI
=
Er
of xy we get from the relative errors
Ixy - xy I Ixy =
xy
(x - EI)(Y - E2 )
xY
Erl
I IElY + E
2X -
=
and
Er2
of X,
y
EI E21
x)'
This proof shows what "approximately" means: we neglected EI E2 as small in absolute
value compared to lEI I and IE21. The proof for the quotient is similar but slightly more
tricky (see Prob. 15).
•
Basic Error Principle
Every numeric method should be accompanied by an error estimate. [f such a formula is
lacking, is extremely complicated, or is impractical because it involves information (for
instance, on derivatives) that is not available, the following may help.
Error Estimation by Comparison. Do a calculation twice with different accuracy.
Regard the difference a2 - al of the results aI, a2 as a (perhaps en/de) estimate of the
error EI of the inferior result al' Indeed, al + EI = a2 + E2 by formula (4*). This implies
a2 - al = EI - E2 = EI because a2 is generally more accurate than aI, so that IE21 is
small compared to IEII.
Loss of Significant Digits
This means that a result of a calculation has fewer correct digits than the numbers from
which it was obtained. This happens if we subtract two numbers of about the same size,
for exan1ple. 0.1439 - 0.1426 ("subtractive cancellation"). It may occur in simple
problems, but it can be avoided in most cases by simple changes of the algorithm-if one
is aware of it! Let us illustrate this with the following basic problem.
E X AMP L E 1
Quadratic Equation. Loss of Significant Digits
Find the roots of the equation
X2 -
40x
+2
=
0,
using 4 significant digits (abbreviated 4S) in the computation.
Solution. A formula for the roots Xl' -"2 of a quadratic equation ax2 +
bx
+c
=
0 is
(6)
Furthermore, since xlx2
(7)
=
cIa, another formula for those roots is
Xl a~
before.
Fro~ (6) we obtain X = 20 :+: V398 = ~O.OO :+: 19.95. This gives Xl = 20.00 + 19.95 = 39.95, involving no
dIffIculty, whereas X2 = 20.00 - 19.95 = 0.05 is poor because it involves loss of significant digits.
In contrast, .(7) gives Xl = 39.?5, X2 = 2.000/39.95 = 0.05006, in error by less than one unit of the last digit,
as a computatIon Wlth more digIts shows. (The lOS-value is 0.05006265674.)
786
CHAP. 19
Numerics in General
Comment. To avoid misunderstandings: 4S was used for convenience; (7) is beller than (6) regardless of
the number of digits used. For instance. the 8S-computation by (6) is xl = 39.949937. X2 = 0.050063. which
i~ poor. and by (7) it is Xl as before. X2 = 2/.\"1 = 0.050062657.
In a quadratic equation with real root,. if X2 is absolutely largest (because b > OJ. use (6) for X2 dnd then
xl =
-_ _-...........
c!(m'2)'
...
•
....
- •••.. - -.... - .. y-~ . . .-.
•• - .
_".
1. (Floating point) Write 98.17, -100.987, 0.0057869,
- 13600 in floating-point form, rounded to 4S (4
significant digits).
2. Write -0.0286403. 11.25845. - 3168\.55 in f1oatingpoint form rounded to 6S.
3. Small differences of large numbers may be
particularly strongly affected by rounding errors.
l\Iu~trate this by computing 0.36443/(17.862 - 17.798)
as given with 5S. then rounding stepwise to 4S. 3S.
and 2S. where "stepwise" means: round the rounded
numbers. not the given ones.
4. Do the work in Prob. 3 with numbers of your choice
that give even more drastically different results. How
can you avoid such difficulties?
5. The quotient in Prob. 3 is of the form a/(b - c). Write
it as alb + c:)/(b 2 - c 2 ). Compute it first with 5S, then
rounding numerator 12.996 and denominator 2.28
stepwise as in Prob. 3. Compare and comment.
6. (Quadratic equation) Solve.\"2 - 20x + I = 0 by (6)
and by (7). using 6S in the computation. Compare and
comment.
7. Do the computations in Prob. 6 with 4S and 2S.
8. Solve.\"2 + 100x + 2 = 0 by (6) and (7) with 5S and
compare.
9. Calculate lIe = 0.367879 (6S) from the partial sums
of 5 to 10 terms of the Maclaurin series (a) of e-" with
x = 1, (b) of eX with x = 1 and then taking the
reciprocal. Which is more accurate?
10. Addition with a fixed number of significant digits
depends on the order in which you add the numbers.
Illustrate this with an example. Find an empirical rule
for the best order.
11. Approximations of 7T = 3.141 592 653 589 79 ...
are 22/7 and 355/1 13. Determine the corresponding
errors and relative errors to 3 significant digits.
12. Compute 7T by Machin's approximation
16 arctan ( 115) - 4 arctan (1/239) to lOS (which are
correct). (In 19X6, D. H. Bailey computed almost
30 million decimals of 7T on a CRA Y-2 in less than
30 hours. The race for more and more decimals is
continuing. )
13. (Rounding and adding) Let al' . . . , an be numbers
with aj correctly rounded to Dj decimals. In calculating
the sum (II + ... + (In' retmmng D = min DJ
decimals, is it essential that we first add and then round
the result or that we first round each number to D
decimals and then add?
14. (Theorems on errors) Prove Theorem I(a) for
addition.
15. Prove Theorem I(b) for division.
16. Show that in Example 1 the absolute value of the error
of X2 = 2.000/39.95 = 0.05006 is less than 0.00001.
17. Overflow and underflow can sometimes be avoided
by simple changes in a formula. Explain this in terms
of V.~ + y2 =
I + (ylx)2 with x 2 ~ y2 and x so
large that x 2 would cause ovelt10w. Invent examples
of your own.
18. (Nested form) Evaluate
I(x) = x 3 - 7.5x 2 + 1l.2x + 2.8
xV
= «x - 7.5)x
+ 11.2)x + 2.8
at x = 3.94 using 3S arithmetic and rounding. in both
of the given forms. The latter, called the nestedfamz,
is usually preferable since it minimizes the number of
operations and thus the effect of rounding.
19. CAS EXPERIMENT. Chopping and Rounding.
(a) Let x = 4/7 andy = 113. Find the error~ Echop, Eround
and the relative errors Er.ch, Er.rd of x + y. x - y. xy. x I)"
in chopping and rounding to 5S. Experiment with other
fractions of your choice.
(b) Graph Echop and Eround (for 5S) of k121 as a
function of k = I, 2, . , . , 21 on common axes. What
average value can you read from the graph for Echop?
For Eround? Experiment with other integers that give
similar graphs. Different types of graphs. Can you
characteri7e the ditTerent types in terms of prime
factors?
(c) How does the situation in (b) change if you take
4S instead of 5S?
(d) Write programs for the work in (a)-(c).
20. WRITI.'JG PROJECT. Numerics. In your own words
write about the overall role of numeric methods in
applied mathematics, wby they are impol1ant, where
and when they must be used or can be used, and how
they are influenced by the use of the computer in
engineering and other work.
SEC. 19.2
19.2
787
Solution of Equations by Iteration
by Iteration
Solution of Equations
From here on. each ~ection will be devoted to some basic kind of problem and
corresponding solution methods. We begin with methods of finding solutions of a single
equation
(1)
f(x)
=
0
where f is a given function. For this task there are practically no formulas (except in a
few :,imple cases). so that one depends almost entirely on numeric algorithms. A solution
of (1) is a number x = s such that f(s) = O. Here. s suggests "solution." but we shall also
use other letters.
Examples are x 3 + x = I, sin x = O.5x. tan x = x, cosh x = sec x, cosh x cos x = -1,
which can all be written in the form (1). The first concerns an algebraic equation because
the corresponding f is a polynomial, and in this case the solutions are also called roots
of the equation. The other equations are transcendental equations because they involve
transcendental functions. Solving equations (1) is a task of prime importance because
engineering applications abound: some occur in Chaps. 2.4. 8 (characteristic equations),
6 (partial fractions), 12 (eigenvalues. zeros of Bessel functions). and 16 (integration). but
there are many, many others.
To solve (1) when there is no formula for the exact solution. we can use an
approximation method. in plli1icular an iteration method, that is, a method in which we
start from an initial guess .1'0 (which may be poor) and compute step by step (in general
better and better) approximations Xl' X2' ... of an unknown solution of (1). We discuss
three such methods that are of particular practical importance and mention two others in
the problem set. These methods and the underlying principles are basic for understanding
the diverse methods in software packages.
In general, iteration methods are easy to program because the computational operations
are the smne in each step-just the data change from step to step--and. more important.
if in a concrete case a method converges. it is stable (see Sec. 19.1) in generaL
Fixed-Point Iteration for Solving Equations
f{x) = 0
Our present use of the word "fixed point"" has absolutely nothing to do with that in the
last section.
In one way or another we transform (1) algebraically into the form
(2)
Then we choose an Xo and compute
(3)
Xl
x
=
=
g(xo), X2
g(x).
=
g(Xl)' and in general
(11 = 0, 1, .. ').
A solution of (2) is called a fixed point of g, motivating the name of the method. This is a
solution of 0). since from X = g(x) we can return to the original form f(x) = O. From (I)
we may get several different forms of (2). The behavior of corresponding iterative sequences
Xo. x l ' . . . may differ, in particular. with respect to their speed of convergence. Indeed, some
of them may not converge at all. Let us illustrate these facts with a simple example.
788
CHAP. 19
Numerics in General
E X AMP L ElAn Iteration Process (Fixed-Point Iteration)
Set up an iteration process for the equation f(x) = x 2
x = 1.5 ±
vT.25.
thus
-
3x
+I
O. Since we know the solutions
=
and
2.618034
0.381966.
we can watch the behavior of the error as the iteration proceeds.
Solution.
The equation may be written
thu~
(4a)
Xn+l =
~(X71 2 +
I).
If we choose Xo = I, we obtain the sequence (Fig. -l23a: computed with 6S and then rounded)
Xo = 1.000,
Xl
= 0.667,
X2 =
0.481,
X3 =
which seems to approach the smaller solution. If we choose Xo
~o = 3, we obtain the sequence (Fig. 423a, upper part)
Xo = 3.000.
Xl
= 3.333.
X2
=
4.037.
X3
=
0.411.
X4
0.390,"
=
.
2, the situation is similar. If we choose
= 5.766.
=
X4
11.415 •...
which diverges.
Our equation may also be written (divide by x)
thu~
(4b)
xn+l =
3-
and if we choose Xo = I, we obtain the sequence (Fig. 423b)
Xo = 1.000,
Xl =
2.000,
X2
= 2.500,
X3
= 2.600,
X4
2.615,· ..
=
which seems to approach the larger solution. Similarly, if we choose Xo = 3, we obtain the sequence (Fig. 423b)
Xo
=
3.000,
Xl =
2.667,
X2 =
2.625,
X3 =
2.619,
X4
=
2.618,· ".
Our figures show the following. In the lower part of Fig. 423a the slope of gl(x) is less than the slope of
< I, and we seem to have convergence. In the upper part, gl(x) is steeper
(g~(x) > I) and we have divergence. In Fig. 423b the slope of g2lx) is less near the intersection point Ix = 2.618,
fixed point of g2, solution of f(x) = 0), and both sequences seem to converge. From all this we conclude that
convergence seems to depend on the fact that in a neighborhood of a solution the curve of g(x) is less steep
than the straight line y = x. and we shall now see that thi~ condition Ig' (x)1 < I (= slope of y = x) is sufficient
for convergence.
•
y = x, which is I, thus Ig~(x)1
5
5
~/
/
c.l
/
x
(a)
Fig. 423.
5
0
0
x
(h)
Example 1, iterations (4a) and (4b)
g2(x)
5
SEC. 19.2
789
Solution of Equations by Iteration
An iteration process defined by (3) is called convergent for an Xo if the corresponding
sequence xo, Xb ••• is convergent.
A sufficient condition for convergence is given in the following theorem, which has
various practical applications.
THEOREM 1
Convergence of Fixed-Point Iteration
Let x = s be a solution of x = g(x) and suppose that g has a continuous derivative
in some interval J containing s. Then !f Ig' (x) I ~ K < 1 in J, the iteration process
defined by (3) converges jor any Xo in J. and the limit of the sequence {xn} is s.
PROOF
By the mean value theorem of differential calculus there is a t between x and s such that
g(x) - g(s)
Since g(s) = s and Xl = g(xo),
Ig' (x) I in the theorem
X2
=
g'(t) (x - s)
(x
in J).
= g(Xl)' ... , we obtain from this and the condition on
Applying this inequality n times, for n, n - 1, ... , 1 gives
Since K < I, we have K n -----? 0; hence IXn
sl-----? 0 as n -----?
-
•
00.
We mention that a function g satisfying the condition in Theorem 1 is called a contraction
because Ig(x) - g(v)1 ~ Klx - vi, where K < 1. Furthermore, K gives information on the
speed of convergence. For instance, if K = 0.5, then the accuracy increases by at least
2 digits in only 7 steps because 0.5 7 < 0.01.
E X AMP L E 2
An Iteration Process. Illustration of Theorem 1
+ x-I ~ a by iteration.
Find a solution of I(x) = ..3
Solution.
A sketch shows that a solution lies near x = 1. We may write the equation as (x 2
1
x = gl(x) = - - - 2 .
1
+
SO
x
that
xn+ 1 =
---2 .
1 +xn
+
l)x = lor
Also
for any x because 4x2 /(1 + x 2 )4 = 4x 2 /(1 + 4x 2 + ... ) < 1, so that by Theorem 1 we have convergence for
any Xo. Choosing Xo = 1. we obtaio (Fig. 424 on p. 790)
Xl ~
0.500,
X2
= 0.800,
X3
=
0.610,
X4
= 0.729,
X5
= 0.653,
X6
= 0.701, ....
The solution exact to 6D is s = 0.682 328.
The given equation may also be written
Then
and this is greater than 1 near the solution, so that we cannot apply Theorem I and as~ert convergence. Try
= 0.5. Xo = 2 and see what happens.
,The example shows that the transformation of a given i(x) = a into the form x = g(x) with g satisfying
Ig (x)1 3 K < 1 may need some experimentation.
•
Xo = 1. Xo
790
CHAP. 19
Numerics in General
1.0
0.5
x
Fig. 424.
Iteration in Example 2
Newton's Method for Solving Equations t(x) = 0
Newton's method, also known as Newton-Raphson's method,! is another iteration
method for solving equations f(x) = 0, where f is assumed to have a continuous derivative
f'. The method is commonly used because of its simplicity and great speed. The underlying
idea is that we approximate the graph of f by suitable tangents. Using an approximate
value Xo obtained from the graph of f, we let Xl be the point of intersection of the x-axis
and the tangent to the curve of f at Xo (see Fig. 425). Then
hence
In the second step we compute X2 = Xl - f(xIV!' (Xl), in the third step X3 from X2 again
by the same formula, and so on. We thus have the algorithm shown in Table 19.1. Formula
(5) in this algorithm can also be obtained if we algebraically solve Taylor's formula
(5*)
y
x
Fig. 425.
Newton's method
IJOSEPH RAPHSON (]648-l715), English mathematician who published a method similar to Newton's
method. For historical details. see Ref. [GR2], p. 203. listed in App. I.
SEC. 19.2
791
Solution of Equations by Iteration
Table 19.1
Newton's Method for Solving Equations I(x)
ALGORITHM NEWTON (j,
t' , XO,
E,
=0
N)
This algorithm computes a solution of f(x) = 0 given an initial approximation Xo (starting
value of the iteration). Here the function f(x) is continuous and has a continuous
derivative f' tx).
f, f', initial approximation xo, tolerance
INPUT:
€
> O. maximum number of
iterations N.
OUTPUT:
For n
Approximate solution
Xn
(n ~ N) or message of failure.
0, I, 2, ... , N - 1 do:
=
Compute
t' (x,,).
If f' (xn) = 0 then OUTPUT "Failure". Stop.
2
[Procedure completed unsuccessfully]
Else compute
3
(5)
4
If
IXn+l -
xnl ~
Elx,J then OUTPUT xn +
1.
Stop.
[Procedure completed sllccessfullv]
End
5
OUTPUT "Failure". Stop.
[Procedure completed unsuccessfully after N iterations]
End NEWTON
If it happens that t' (x n ) = 0 for some n (see line 2 of the algorithm), then try another
statting value Xo. Line 3 is the heart of Newton's method.
The inequality in line 4 is a termination criterion. [f the sequence of the Xn converges
and the criterion holds, we have reached the desired accuracy and stop. In this line the
factor IXnl is needed in the case of zeros of very small (or very large) absolute value
because of the high density (or of the scarcity) of machine numbers for those x.
WARNING! The criterion by itself does not imply convergence. Example. The harmonic
series diverges, although its partial sums Xn = Lk~l 11k satisfy the criterion because
lim (Xn+l - xn) = lim (1/(n + 1) = O.
Line 5 gives another termination criterion and is needed because Newton's method may
diverge or, due to a POOT choice of xo, may not reach the desired accuracy by a reasonable
number of iterations. Then we may try another Xo. If I(x) = 0 has more than one solution,
different choices of Xo may produce different solutions. Also, an iterative sequence may
sometimes converge to a solution different from the expected one.
792
E X AMP L E 3
CHAP. 19
Numerics in General
Square Root
Set up a Newton iteration for computing the square root x of a given positive number c and apply it to c = 2.
Solutioll.
We have x =
Vc. hence f(x) =
r2 - c
= O.
f' (x)
=
X3 =
1.414216,
2r. and (5) takes the form
For c = 2, choosing '\0 = I. we obtain
Xl
X4
E X AMP L E 4
= 1.500000,
=
X2
1.416667,
X4 =
1.414214....
•
is exact to 6D.
Iteration for a Transcendental Equation
Find the positive solution of 2 sin x = x.
Solution.
=
Setting f(x)
xn+l
r - 2 sin x. we have
=
Xn -
Xn -
f' (x) =
2 sinxn
I - 2 cos x. and (5) gives
2( sin xn - xn cos xn)
1- 2cosxn
1- 2cosxn
From the graph of f we conclude that the solution is near Xo = 2. We compute:
I:
1
2
3
X4 =
E X AMP L E 5
2.00000
1.90100
1.89552
1.89550
1.83229
1.64847
1.63809
1.63806
3.48318
3.12470
3.10500
3.10493
1.90100
1.89552
1.89550
1.89549
1.89549 is exact to 5D since the solution to 6D is
1.~95
•
494.
Newton's Method Applied to an Algebraic Equation
Apply Newton's method to the equation f(x) = x3
Solution.
+X-
I =
o.
From (5) we have
Starting from Xo = 1. we obtain
Xl =
0.750000,
X2
= 0.686047,
X3 =
0.682 340,
X4 =
0.682 328, ...
where .'4 ha~ the error -I . 10- . A comparison with Example 2 shows that the present convergence is much
•
more rapid. This may motivate the concept of the order of all iteration process, to be discussed next.
6
Order of an Iteration Method. Speed of Convergence
The quality of an iteration method may be characterized by the speed of convergence, as
follows.
Let xn+ 1
=
= g(x'Y!)
g(x). Then
define an iteration method, and let Xn approximate a solution s of
=
S - En' where En is the error of X n . Suppose that g is differentiable
a number of times, so that the Taylor formula gives
x
Xn
Xn+l
(6)
=
g(xn )
=
g(s)
= g(s)
+ g'(s)(xn
- g'(S)En
- s)
+ h"(s)(xn
+ !g"(S)En 2 +
-
S)2
+
SEC. 19.2
793
Solution of Equations by Iteration
The exponent of En in the first oonvanishing term after g(s) is called the order of the
iteration process defined by g. The order measures the speed of convergence.
To see this, subtract g(s) = s on both sides of (6). Then on the left you get
Xn+l - S = -E,,+l, where E,,+1 is the error of Xn+l' And on the right the remaining
expression equals approximately its first nonzero term because IEnl is small in the case of
convergence. Thus
(a)
in the case of first order,
E,,+l=+g'(S)En
(7)
in the case of second order,
etc.
Thus if En = lO-k in some step, then for second order. En+l = C· (l0-k)2 = c· 1O-2k,
so that the number of significant digits is about doubled in each step.
Convergence of Newton's Method
In Newton's method, g(x)
=
x - f(x)/f' (x). By differentiation,
g'(x) = 1 -
f' (X)2
-
f
(8)
f(.T)f"(X)
,
2
(x)
f(x)f"(x}
f' (X)2
Since f(s) = 0, this shows that also g'(s) = O. Hence Newton's method is at least of
second order. If we differentiate again and set x = s, we find that
(8*)
g"(S) =
f"(S)
f'(s)
which will oot be zero in general. This proves
THEOREM 2
Second-Order Convergence of Newton's Method
if f(x) is three times differentiable and f' and f" are not z.ero at a solution s of
f(x)
=
0, then for Xo sufficiently close to s, Newton's method is of second order.
Comments.
For Newton's method, (7b) becomes, by (8*),
For the rapid convergence of the method indicated in Theorem 2 it is important that s be
a simple zero of f(x) (thUS (s) =I=- 0) and that Xo be close to s, because in Taylor's formula
we took only the linear term [see (5*)], assuming the quadratic term to be negligibly small.
(With a bad Xo the method may even diverge!)
f'
794
E X AMP L E 6
CHAP. 19
Numerics in General
Prior Error Estimate of the Number of Newton Iteration Steps
Use .1"0 = 2 and xl = I.YUl 111 Example 4 for estimating how many Iteration steps we need to produce the
solution to 5D accuracy. This is an a priori estimate or prior estimate because we can compute it after only
one iteration, prior to further iterations.
Solution.
We have I(x} =
x-
2 sin X = O. Differentiation gives
{'(s)
{'(Xl)
2 sin x]
= - - - = ---.....:.--- =
,
2J (s)
2I' (.~I)
0.57.
Hence (Y) gives
1"n+ll
= 0.57"n2 = 0.57(0.57"~_IJ2 =
where M = 2n + 2 n condition becomes
I
+ ... + 2 + I = 2n+I -
0.573"~t_l
= ... = 0.57M,,~+I ;;: 5' 10-6
I. We show below that Eo = -0.11. Consequently. our
Hence II = 2 is the smallest possible n. according to this cnlde estimate. in good agrecment with Example 4.
"0 = -0.11 is obtained from "1 - Eo = ("1 - s) - (Eo - s) = -Xl + Xo = 0.10. hence
2
2
"1 = "0 + 0.10 = -0.57"0 or 0.57"0 + "0 + 0.10 = O. which gives "0 = -0.11.
•
Difficulties may arise if It' (x) I is very small near a
solution s of f(x) = o. for instance, if s is a zero of f(x) of second (or higher) order (so
that Newton's method converges only linearly, as an application of I'H6pital's rule to
(8) shows). Geometrically, small It' (x) I means that the tangent of f(x) near s almost
coincides with the x-axis (so that double precision may be needed to get f(x) and f' (x)
accurately enough). Then for values x = far away from s we can still have small function
values
Difficulties in Newton's Method.
s
R(s)
= fCS).
In this case we call the equation f(x) = 0 ill-conditioned. R(s) is called the residual of
f(x) = 0 at S. Thus a small residual guarantees a small error of only if the equation is
not iII-conditioned.
s
E X AMP L E 7
An Ill-Conditioned Equation
J(x) = X5 + 1O- 4.t = 0 is ill·conditioned. x = 0 is a solution . .t' (0) = 10-4 is small. At s = 0.1 the residual
J(O.I) = 2· 10- 5 is small, but the en-or -0.1 is laIger in absolute value by a factor 5000. Invent a more drastic
example of your own.
•
Secant Method for Solving {(x)
=
0
Newton's method is very powerful but has the disadvantage that the derivative f' may
sometimes be a far more difficult expression than f itself and its evaluation therefore
computationally expensive. This situation suggests the idea of replacing the derivative
with the difference quotient
t' (xn ) =
f(x n )
-
Xn -
f(x n -
1)
Xn - 1
Then instead of (5) we have the formula of the popular secant method
SEC. 19.2
Solution of Equations by Iteration
795
y
x
Fig. 426.
Secant method
(10)
Geometrically, we intersect the x-axis at -',,+1 with the secant of f(x) passing through
Pn - 1 and Pn in Fig. 426. We need two starting values Xo and xl' Evaluation of derivatives
is now avoided. [t can be shown that convergence is superlinear (that is, more rapid than
linear, IEn+11 = const 'IEnIL62; see [E5] in App. 1), almost quadratic like Newton's method.
The algorithm is similar to that of Newton's method, as the student may show.
CAUTION!
It is not good to write (0) as
xn-d(xn) - -'llf(Xn -1)
f(xn) - f(X n -1)
because this may lead to loss of significant digits if Xn and Xn-l are about equal. (Can
you see this from the formula?)
EXA M P L E 8
Secant Method
Find the positive solution of f(x) = x - 2 sin x = 0 by the secant method, starting from .Io = 2,
Solution.
Xl =
1.9.
Here, (10) is
Numerical values are:
n
X n -1
xn
Nn
D"
2
2.000000
1.900000
1.895747
1.900000
1.895747
1.895494
-0.000740
-0.000002
0
-0.174005
-0.006986
3
X3
= 1.895 494 is exact
to
6D. See Example 4.
-0.004253
-0.000252
o
•
Summary of Methods. The methods for computing solutions s of f(x) = 0 with given
continuous (or differentiable) f(x) start with an initial approximation Xo of s and generate
a sequence Xl, X2, . . . by iteration. Fixed point methods solve f(x) = 0 written as
x = g(x), so that s is a fixed point of g, that is, s = g(s). For g(x) = X - f(x)li' (X) this
is Newton's method, which for good Xo and simple zeros converges quadratically (and
for multiple zeros linearly). From Newton's method the secant method follows by
replacing f' (x) by a difference quotient. The bisection method and the method of false
position in Problem Set 19.2 always converge, but often slowly.
CHAP. 19
796
Numerics in General
...... - .
--11-71 FIXED-POINT ITERATION
Apply fixed-point iteration and answer related questions
where indicated. Show details of your work.
1. x = 1.4 sin x, Xo = 1.4
2. 00 the iterations indicated at the end of Example 2.
Sketch a figure similar to Fig. 424.
3. Why do we obtain a monotone sequence in Example
I, but not in Example 2?
f = X4 - X + 0.2 = 0, the root near I, Xo
5. f as in Prob. 4, the root near 0, Xo = 0
=
6. Find the smallest positive solution of sin x
e- 0 . 5X ,
4.
Xo =
=
I.
7. (Bessel functions, drumhead) A partial sum of the
Maclaurin series of JoCr) (Sec. 5.5) is
f(x) = I - !x 2 + i'4x4 - 23~4X6. Conclude from a
sketch that f(x) = 0 near x = 2. Write f(x) = 0 as
x = g(x) (by dividing f(x) by !x and taking the
resulting x-term to the other side). Find the zero. (See
Sec. 12.9 for the importance of these zeros.)
8. CAS PROJECT. Fixed-Point Iteration. (a) Existence.
Prove that if g is continuous in a closed interval f and its
range lies in f, then the equation x = g(x) has at least one
solution in f. Illustrate that it may have more than one
solution in f.
(b) Convergence. Let f(x) = x 3 + 2r2 - 3x - 4 = o.
Write this asx = g(x), for g choosing (1) (x 3 - /)113,
(2) (r2 - !n1l2, (3) x + !f, (4) x(l + !f),
(5) (x 3 - f)lx 2 , (6) (2x 2 - f)/2x, (7) x - flf'
and in each case Xo = 1.5. Find out about convergence
and divergence and the number of steps to reach exact
6S-values of a root.
19-181 NEWTON'S METHOD
Apply Newton's method (60 accuracy). First sketch the
function(s) to see what is going on.
~ ~nx
= cot x,
Xo
=
12. x
-
5x
+ In x
+ 3
=
2,
16. (Legendre polynomials) Find the largest root of the
Legendre polynomial P 5 (x) given by
5
P5(X) = k(63x - 70x 3 + 15x) (Sec 5.3) (to be
needed in Gauss integration in Sec. 19.5) (a) by
Newton's method, (b) from a quadratic equation.
17. Design a Newton iteration for cube roots and compute
~ (60, Xo = 2).
18. Design a Newton iteration for -\7c" (c > 0). Use it to
3-.4(::;- ,5(::;compute \12, V'2, v 2, v 2 (60, Xo = I).
19. TEAM PROJECT. Bisection Method. This simple
but slowly convergent method for finding a solution of
f(x) = 0 with continuous f is based on the
intermediate value theorem, which states that if a
continuous function f has opposite signs at some x = a
and x = b (> a). that is, either f(a) < 0, feb) > 0
or f(a) > 0, feb) < 0, then f must be 0 somewhere
on [a, b]. The solution is found by repeated bisection
of the interval and in each iteration picking that half
which also satisfies that sign condition.
(a) Algorithm. Write an algorithm for the method.
(b) Comparison. Solve x
= cosx by Newton's
method and by bisection. Compare.
(e) Solve e- x = In x and eX
+ x4 + X
2 by bisection.
=
20. TEAM PROJECT. Method of False Position
(Regula falsi). Figure 427 shows the idea. We assume
that f is continuous. We compute the x-intercept Co of
the line through (ao, f(ao), (b o , f(bo». If f(co) = 0,
we are done. If f(ao)f(c o ) < 0 (as in Fig. 427), we
set a 1 = ao, b 1 = Co and repeat to get Cl, etc.
If f(ao)ffco) > 0, then f(co)f(b o ) < 0 and we set
al = Co, b 1 = b o, etc.
(a) Algorithm. Show that
10. x = cos x, xo
11. x 3
15. (Associated Legendre functions) Find the smallest
positive zero of
2
P4 = (I - X2)p~ = ¥(-7x 4 + 8x 2 - 1) (Sec.
5.3) (a) by Newton's method, (b) exactly, by solving
a quadratic equation.
0, Xo = 2
Xo =
2
13. (Vibrating beam) Find the solution of cos x cosh x = I
near x = ~ 7T. (This determines a frequency of a vibrating
beam: see Problem Set 12.3.)
14. (Heating. cooling) At what time x (4S-accuracy only)
will the processes governed by fl(X) = 100(1 - e- O.2x )
and f2(X) = 4Oe- O.Olx reach the same temperature?
Also find the latter.
Co
=
aof(bo ) - bof(ao)
f(bo) - f(ao)
and write an algorithm for the method.
(b) Comparison. Solve x 3 = 5x + 6 by Newton's
method, the secant method, and the method of false
position. Compare.
(C) Solve X4 = 2, cos x = V~, and
by [he method of false position.
x
+
In
x
2
SEC. 19.3
797
Interpolation
y
121-241
~/
SECANT METHOD
Solve. using
~
y=f(x)
x
Xo
and
23. Prob. 9,
=
0.5
Xl
=
J
Fig. 427.
19.3
indicated.
= 2.0
= I. Xl
I.
Xo =
24. Prob. 10, Xo
/(
Xl i1~
21. Prob. ll, Xo = 0.5, Xl
22. e- x - tan X = o. Xo
=
Xl
0.5,
0.7
I
25. WRITING PROJECT. Solution of Equations.
Compare the methods in this section and problem set,
discussing advantages and disadvantages using
examples of your own.
Method of false position
Interpolation
Interpolation means finding (approximate) values of a function f(x) for an X between
Xl' • • . , xn at which the values of f(x) are given. These values may
come from a "mathematical" function, such as a logarithm or a Bessel function, or, perhaps
more frequently, they may be measured or automatically recorded values of an "empirical"
function, such as the air resistance of a car or an airplane at different speeds, or the yield
of a chemical process at different temperatures, or the size of the U.S. population as it
appears from censuses taken at IO-year intervals. We write these given values of a function
f in the form
different x-values xo,
fo
=
f(xo),
fn
= f(x,J
or as ordered pairs
A standard idea in interpolation now is to find a polynomial Pn(x) of degree n (or less)
that assumes the given values; thus
(1)
We call this Pn an interpolation polynomial and xo, ... , xn the nodes. And if f(x) is a
mathematical function, we call Pn an approximation of f (or a polynomial
approximation, because there are other kinds of approximations, as we shall see later).
We use Pn to get (approximate) values of f for x's between Xo and xn ("interpolation")
or sometimes outside this interval Xo ~ x ~ Xn ("extrapolation").
Motivation. Polynomials are convenient to work with because we can readily
differentiate and integrate them. again obtaining polynomials. Moreover, they approximate
continuous functions with any desired accuracy. That is. for any continuous f(x) on an
interval 1: a ~ x ~ b and error bound f3 > 0, there is a polynomial Pn(x) (of sufficiently
high degree n) such that
If(x) - Pn(X)
I < f3
for all x on 1.
This is the famous Weierstrass approximation theorem (for a proof see Ref. [GR7],
p. 280; see App. O.
798
CHAP. 19
Numerics in General
Existence and Uniqueness. Pn satisfying (I) for given data exists-we give formulas
for it below. Pn is unique. Indeed, if another polynomial qn also satisfies
lJn(.ro) = fo, ... , lJn(xn) = f n • then Pn(x) - lJn(x) = 0 at xo . ...• X n , but a polynomial
Pn - qn of degree 11 (or less) with 11 + I roots must be identically zero, as we know from
algebra; thus Pn(x) = q,,{x) for all x, which means uniqueness.
•
How to Find Pn? This is the important practical question. We answer it by explaining
several standard methods. For given data. these methods give the same polynomial. by
the uniqueness just proved (which is thus of practical interest!), but expressed in several
forms suitable for different purposes.
Lagrange Interpolation
Given (xo, f 0), (Xl' f 1). ' , , , (Xn , f n) with arbitrarily spaced Xj' Lagrange had the idea of
multiplying each f j by a polynomial that is I at Xj and 0 at the other 1l nodes and then
taking the sum of these 11 + 1 polynomials. Clearly, this gives the unique interpolation
polynomial of degree 11 or less. Beginning with the simplest case, let us see how this works.
Linear interpolation is interpolation by the straight line through (xo, fo), (Xl' II): see
Fig. 428. Thus the linear Lagrange polynomial PI is a sum PI = Lof0 + Ld 1 with Lo the
linear polynomial that is I at Xo and 0 at Xl; similarly, Ll is 0 at Xo and I at Xl' Obviously,
Lo(x)
x -
=
Xl -
Xo
Xo
This gives the linear Lagrange polynomial
(2)
. fo
+
Error~
~
/1"
/ f .o
p(x)
1
1--'"
}'=f(x)
f)
x
x
Fig. 428.
E X AMP L E 1
Linear interpolation
Linear Lagrange Interpolation
Compute a 4D-value of In \1.2 from In 9.0 = 2.1\172. In 9.5 = 2.2513 by linear Lagrange interpolation and
determine the error, llsing In \1.2 = 2.2192 (4Dl.
Solutioll.
Xo
= 9.0. Xl = 9.5.10
=
In 9.0. it
x - 9.5
Lo(x)
= --=0:5 = - 2.0(x
L 1 (x)
= ----0:5 = 2.0(x -
x - 9.0
= In 9.5. In (2) we need
- 1).5).
9.0).
Lo«).2)
= - 2.0( -0.3) = 0.6
L 1 (9.2) = 2· 0.2
=
0.4
(see Fig. 429) and obtain the answer
In 9.2 ~ Pl(9.2)
= L o(9.2)1 0 +
L 1 (9.2)1 1
= 0.6' 2.1972 +
0.4' 2.2513
= 2.2188.
SEC. 19.3
799
Interpolation
The error is E = a - a = 2.2192 - 2.2188 = OJlO04. Hence linear interpolation is not sufficient here to get
4D-accuracy; it would suffice for 3D-accuracy.
•
~~- --_-"'oX'L,
o
... --~I~I~'~I--~--~~
9 9.2 9.5
Fig. 429.
11 x
10
Lo and L, in Example 1
Quadratic interpolation is interpolation of given (xo, fo),
second-degree polynomial P2(X), which by Lagrange's idea is
(Xl, f1), (X2, f2)
by a
(3a)
Lo(x)
=
[ (x)
(x - Xl)(X - X2)
_0_
[o(xo)
(3b)
LI(x)
(xo -
[1 (x)
(x - Xo)(x -
[1(X1)
(Xl - XO)(XI -
= --
~(x) =
Xl)(XO -
(x - Xo)(X -
=
[2(X)
[2(X2)
X2)
X2)
X2)
Xl)
(X2 - XO)(X2 -
Xl)
How did we get this? Well, the numerator makes Lk(xj) = 0 ifj
makes Lk(xk) = 1 because it equals the numerator at X = Xk'
E X AMP L E 2
=1=
k. And the denominatOi
Quadratic Lagrange Interpolation
Compute In 9.2 by (3) from the data in Example I and the additional third value In 11.0
Solution.
=
2.3979.
In (3),
Lo(x) =
(x - 9.5)(x - 11.0)
(9.0 - 9.5)(9.0 - 11.0)
= x
2
-
(x - 9.0)(x - 1 1.0)
20.5x
I
= (9.5 _ 9.0)(9.5 _ 11.0) = - 0.75
~(x)
= (11.0 _ 9.0)(11.0 _ 9.5) =
1
"3
Lo(9.2) = 0.5400,
2
L 1(x)
(x - 9.0)(x - 9.5)
+ 104.5,
(x
+ 99),
L1(9.2)
= 0.4800,
+ 85.5),
~(9.2)
= -0.0200,
- 20x
2
(x
- 18.5x
(see Fig. 430), so that (3a) gives, exact to 4D,
In 9.2 = P2(9.2)
= 0.5400' 2.1972 + 0.4800' 2.2513
- 0.0200' 2.3979
x
Fig. 430.
Lo, L" L2 in Example 2
= 2.2192.
•
800
CHAP. 19
Numerics in General
General Lagrange Interpolation Polynomial.
For general n we obtain
(4a)
where Lk(Xk) = 1 and Lk is 0 at the other nodes. and the Lk are independent of the function
f to be interpolated. We get (4a) if we take
(4b)
lo(x)
=
(x - X1)(X - X2) ... (x - xn),
Ik(x)
=
(x - xo) ... (x - Xk-1 )(x -
In(x)
=
(x - xo)(x - xl) ... (x - X,,-l)'
Xk+1) ... (x -
0< k < n,
x n ),
We can easily see that P,,(Xk) = h. Indeed, inspection of (4b) shows that Ik(xj) = 0 if
=1= k, so that for x = Xk' the sum in (4a) reduces to the single tenn (lk(Xk)llk (Xk»fk = h.
j
Error Estimate. If f is itself a polynomial of degree n (or less), it must coincide with
p" because the n + 1 data (xo. fo), ... , (x11' fIt) determine a polynomial uniquely. so
the error is zero. Now the special f has its (n + 1)st derivative identically zero. This
makes it plausible that for a ~ene/"{t! fits (11 + l)st derivative tn+ll should measure the
error
En(X)
=
f(x) - p,,(x).
t
It can be shown that this is true if n + ll exists and is continuous. Then, with a suitable
t between Xo and x" (or between Xo. x n ' and x if we extrapolate),
f'n+l>(t)
(5)
E.,(.l) = f(x) - Pn(x) = (x - .lo)(x - Xl) .•• (X -
ern)
-=-------'-'--
(11
+
I)!
Thus !En(x)i is 0 at the nodes and small near them, because of continuity. The product
(x - Xo) ... (x - xu) is large for \" away from the nodes. This makes extrapolation risky.
And interpolation at an x will be best if we choose nodes on both sides of that x. Also,
we get error bounds by taking the smallest and the largest value of f'n+ll(t) in (5) on the
interval Xo ~ t ~ Xn (or on the interval also containing x if we extrapolate).
Most importantly, since Pn is unique, as we have shown, we have
THEOREM 1
Error of Interpolation
Formula (5) gil'es the error for allY po/molllia/ interpolation lIIet/lOd (f f(x) has a
continuous (11 + I )st deri\'{/tive.
Practical error estimate. If the derivative in (5) is difficult or impossible to obtain, apply
the Error Principle (Sec. 19.1), that is, take another node and the Lagrange polynomial
Pn+1(X) and regard PII+1(X) - Pn(x) as a (crude) error estimate for Pn(x),
SEC. 19.3
Interpolation
E X AMP L E 3
801
Error Estimate (5) of Linear Interpolation. Damage
by Roundoff. Error Principle
Estimate the error in Example I first by (5) directly and then by the Error Principle (Sec. 19.1).
Soluti01l.
(A)
Estimatioll by
(5),
We have
11
= 1.
fit)
= In 1. {(t) =
lIt.
{'(t) =
-2-
2t
Hence
0.03
(-1)
"l(x) = (x - 9.0){x - 9.5)
-1112.
thus
"1(9.2) =
-2- .
t
2
= 9.0 gives the maximum 0.03/9 = 0.00037 and f = 9.5 gives the minimum 0.03/9.5 2 = 0.00033.
SO that
we get 0.00033 ~ "1(9.2) ~ 0.00037, or better, 0.00038 because 0.3/81 = 0.003 703 ....
But the error 0.0004 in Example I disagrees, and we can learn something! Repetition of the computation there
with 5D instead of 4D gives
1
In 9.2 = 1'1(9.2)
= 0.6' 2.19722 + 0.4 . 2.25129 = 2.21885
with an actual en'or " = 2.21920 - 2.21885 = 0.00035. which lies nicely near the middle between our two
error bounds.
This shows that the discrepancy 10.0004 vs. 0.00035) was caused by rounding, which is not taken into account
in (5).
(B) EstinUltioll by the Error Pri1lciple. We calculate 1'1(9.2) = 2.21885 as before and then 1'2(9.2) as in
Example 2 but with 5D, obtaining
1'2(9.2)
= 0.54' 2.19722 + 0.48· 2.25129
- 0.02' 2.39790
= 2.21916.
The difference P2(9.2) - 1'1(9.2) = 0.00031 is the approximate error of 1'1(9.2) that we wanted to obtain: this
is an approximation of the actual en'or 0.00035 given above.
•
Newton's Divided Difference Interpolation
For given data (xo, f 0)' ... , (xn , f n) the interpolation polynomial Pn(x) satisfying (l) is
unique, as we have shown. But for different purposes we may use Pn(x) in different forms.
Lagrange's form just discussed is useful for deriving formulas in numeric differentiation
(approximation formulas for derivatives) and integration (Sec. 19.5).
Practically more imp0l1ant are Newton's forms of Pn(x), which we shall also use for solving
ODEs (in Sec. 21.2). They involve fewer arithmetic operations than Lagrange's fonn.
Moreover, it often happens that we have to increase the degree 11 to reach a required accuracy.
Then in Newton's forms we can use all the previous work and just add another term. a
possibility without counterpart for Lagrange's form. This also simplifies the application of
the Error Principle (used in Example 3 for Lagrange). The details of these ideas are as follows.
Let Pn-l(X) be the (n - l)st Newton polynomial (whose form we shall detelmine); thus
Pn-l(XO) = fO,Pn-1(X1) = f1,' .. ,Pn-1(Xn -l) = fn-l' FUl1hermore, let us write the nth
Newton polynomial as
(6)
hence
(6')
Here gn(x) is to be determined so that Pn(xO) = fo, Pn(Xl) = flo ... , Pn(xn ) = f n'
Since p" and Pn-l agree at xo, ... , X n -l' we see that gn is zero there. Also, gn will
generally be a polynomial of nth degree because so is Pn' whereas Pn-l can be of degree
n - I at most. Hence gn must be of the form
(6")
CHAP. 19
802
Numerics in General
We determine the constant an- For tIns we set x = Xn and solve (6") algebraically for
Replacing gn(xn ) according to (6') and using Pn(xn ) = In' we see that this gives
(In-
(7)
We write {lk instead of {In and show that (lk equals the kth divided difference, recursively
denoted and defined as follows:
and in general
(8)
PROOF
If II = 1, then Pn-l (xn ) = Po(xl )
of f(x) at Xo. Hence (7) gives
= Io because Po(x) is constant and equal to Io, the value
and (6) and (6") give the Newton interpolation polynomial of the first degree
If II
=
2, then this PI and (7) give
where the last equality follows by straightforward calculation and comparison with the
definition of the right side. (Verify it: be patient.) From (6) and (6") we thus obtain the
second Newton polynomial
For
II
= k, formula
Withpo(x)
=
(6) gives
fo by repeated application with k
=
I,· .. ,
II
this finally gives Newton's
divided difference interpolation formula
(10)
f(x) = fo
+ (x - xo)f[xo, XI] + (x - xo)(x - xI)flxo,
XI' X2]
+ ... + (x - xo)(x - XI) ... (x - xn-I)f[xo, ... , xn].
SEC. 19.3
803
Interpolation
An algorithm is shown in Table 19.2. The first do-loop computes the divided differences
and the second the desired value Pn(x).
Example 4 shows how to arrange differences near the values from which they are
obtained: the latter always stand a half-line above and a half-line below in the preceding
•
column. Such an arrangement is called a (divided) difference table.
Table 19.2
Newton's Divided Difference Interpolation
ALGORITHM INTERPOL txo, ... , Xn;
f 0,
••• ,
f n; x)
This algorithm computes an approximation p,,(.f) of f(.O at .f.
INPUT: Data (xo- f 0), (x b .fI) •...• (xno f n);
Approximation Pn(.x) of fCO
OUTPUT:
Set f[-':i1
=
For
I.. . ..
111 =
x
For j
U = 0, ... , n).
fj
11 -
I do:
0, ....
=
11 -
Tn
do:
End
End
Set Po(x)
For k
=
=
f o·
1, ... ,
11
do:
End
OUTPLT Pn(i)
End INTERPOL
E X AMP L E 4
Newton's Divided Difference Interpolation Formula
Compute f(9.2) from the values shown in the fust two columns of the following table.
Xj
fj
=
ft9
!to
(2.079442"
....
9.0
2.197225
f[Xj. Xj+l]
".
(0.117 78~
,----- - \-U.006 433
0.lO8134
9.5
2.251 292
-0.005200
0.097735
1l.0
2.397895
804
CHAP. 19
Numerics in General
Soluti01l. We compute the divided differences as shown. Sample computation:
(0.097735 - O.lO8 134)/(11 - 9) = -0.005200.
The values we need in (lO) are circled. We have
f(x) = P3(x) = 2.079442
+ 0.117783(,' - KO) - 0.006433(x - 8.0)(x - 9.0)
+ 0.000 4Il(x - 8.0)(x - 9.0)(x - 9.5).
Alx=9.2,
J(9.2) = 2.079442
+ 0.141 340 - 0.001544 - 0.000030
2.219208.
=
The value exact to 6D is J(9.2) = In 9.2 = 2.219203. Note that we can nicely see how the accuracy increases
from term to term:
Pl(9.2) = 2.220782.
P2(9.2) = 2.219 238,
•
P3(9.2) = 2.219208.
Equal Spacing: Newton's Forward Difference Formula
Newton's formula (10) is valid for arbitrarily spaced nodes as they may occur in practice
in expe1iments or observations. However, in many applications the x/s are regularly
spaced-for instance, in measurements taken at regular intervals of time. Then, denoting
the distance by 11, we can write
(11)
Xn
=
Xo
We show how (8) and (10) now simplify considerably!
To get started, let us define the first forward difference of I at
the second forward difference of I at
Xj
Xj
+
I1h.
by
by
and, continuing in tins way, the kth forward difference of I at
Xj
by
(12)
(k
=
1,2, .. ').
Examples and an explanation of the name "forward" follow on p. 806. What is the point
of this? We show that if we have regular spacing (11), then
(13)
We prove (13) by induction. It is true for k
=
1
I7
I because
Xl
=
.\"0
+ /z,
1
(II - Io) = -111
I::.Io·
• 7
SO
that
SEC. 19.3
805
Interpolation
Assuming (13) to be true for all forward differences of order k. we show that (13) holds
for k + 1. We use (8) with k + I instead of k; then we use (k + I)h = Xk+1 - .\"0' resulting
from (II). and finally (12) with) = 0, that is. j,k+lfo = !lokfl - j.kfo. This gives
(k
+
which is (13) with"
+
___
1 j.k
[ k!hk· fl
I)h
____
I
~k
fo
k!hk
]
•
1 instead of ". Formula (13) is proved.
In (10) we finally set x = .\"0 + rh. Then x - Xo = rh, x - XI = (r - 1)h since
Xo = h, and so on. With this and (13), formula (10) becomes Newton's (or
Gregory2-Newtoll's) forward difference interpolation formula
Xl
f(x)
= Pn(x) =
~
(:)
~sfo
(x
=
Xo
+
r = (x - xo)/17)
rh,
s=o
(14)
= fo + rj.fo +
1)
r(r -
2!
2
j. fo
r(r -
+ ... +
1) ... (r -
11
+
Il!
1)
j.nfo
where the binomial coefficients in the first line are defined by
(15)
(~) = I.
(:) = _r(_r_-_1)_(r_-_2)_s;_'_'_(r_-_s_+_I_)
(s
> O. integer)
and s! = 1 . 2 ... s.
Error.
(16)
From (5) we get. with.\" -
En(X)
=
f(x) - Pn(x)
=
.\"0
=
rh,
X -
hn + I
(n
+
1)!
r(r -
Xl
=
(r -
1)h. etc ..
1) ... (r - n)tn + ll (t)
with t as characterized in (5).
Formula (16) is an exact formula for the error, but it involves the unknown t. In Example
5 (below) we show how to use (16) for obtaining an enor estimate and an interval in
which the true value of f(x) must lie.
Comments on Accuracy. (A) The order of magnitude of the enor
to that of the next difference not used in Pn(x),
En(X)
is about equal
(B) One should choose xo, .... x 71 such that the x at which one interpolates is as well
centered between xo, . . . , Xn as possible.
2JA~ES GREGORY (1638-1675). SC?ts mathematician. professor a[ SI. Andrews and Edinburgh. ~ in (14)
and'\ (on p. 807) have nothmg [0 do WIth the Laplacian.
806
CHAP. 19
Numerics in General
The reason for (A) is that in (16),
II"(r __
- J)__________
... (r - 11)1
-cc
~~
~
1 ·2 ... (11
+
~
J)
if
Irl
~
]
(and actually for any I" as long as we do not extrapolate). The reason for (B) is that
11"(1" - 1) ... (r - n)1 becomes smallest for that choice.
E X AMP L E 5
Newton's Forward Difference Formula. Error Estimation
Compute cosh 0.56 from (14) and the four values in the following table and estimate the error.
cosh Xj
j
Xj
°
0.5
)~IP 626;
0.6
1.185465
fj =
2
t:. fj
t:.fj
1
(O.U5783')~
p.Ol] 865,
0.069704
2
t:,.3fj
0.7
,0 ..900697/
1.255169
0.012562
0.082266
0.8
I
1.337435
Solution. We compute the forward differences as shown in the table. The values we need are circled. In
(14) we have r = (0.56 - 0.50)/0.1 = 0.6, so that (14) gives
cosh 0.56 = 1.127626
=
+ 0.6' 0.057839 + 0.6(~OA) . 0.011 ROS + 0.6(-0.:)(-1.4) . 0.000 697
1.127 626 + 0.034 703 - 0.001 424 + 0.000039
= 1.16U 944.
Error estimate. From (16), since the founh derivative is cosh(4l t
0.1
1"3(0.56) =
where A = -0.000003 36 and 0.5 ~ t
and smallest cosh t in that interval:
=
cosh t,
4
4! ·0.6(-OA)(-1.4)(-2A) cosh I
=
A cosh t,
~
0.8. We do not know t, but we get an inequality by taking the largest
A cosh 0.8 ~ 1"3(0.62) ~ A cosh 0.5.
Since
f(xl = P3(x)
+
~(x),
this gives
P3(0.56) + A cosh 0.8 ~ cosh 0.56 ~ P3(0.56)
+ A cosh 0.5.
Numeric values are
1.I60 939
~
cosh 0.56
~
1.160 941.
The exact 6D-value is cosh 0.56 = 1.160941. It lies within these bounds. Such bounds are not always so tight.
•
Also, we did not consider roundoff errors, which will depend all the number of operations.
This example also explains the name ':forward difference formula": we see that the
differences in the formula slope forward in the difference table.
SEC. 19.3
807
Interpolation
Equal Spacing: Newton's Backward Difference Formula
Instead of forward-sloping differences we may also employ backward-sloping differences.
The difference table remains the same as before (same numbers, in the same positions), except
for a very harmless change of the running subscript j (which we explain in Example 6, below).
Nevertheless, purely for reasons of convenience it is standard to introduce a second name
and notation for differences as follows. We define the first backward difference of fat Xj by
the second bac/..:ward difference of f at
Xj
by
and, continuing in this way, the kth backward difference of f at Xj by
(17)
(k
= 1,2, .. ').
A formula similar to (14) but involving backward differences is Newton's (or
Gregory-Newton's) backward difference interpolation fonnula
f(x)
= Pn(x) = ~
(r + : -
(18)
=
E X AMP L E 6
fo
+
rVfo
+
r(r
+
])
1)
2!
VSfo
(x = Xo
2
r(1"
V fo
+ ... +
+
+
rh, r = (x - xo)lh)
1) ... (r
n!
+n-
1)
Vnfo·
Newton's Forward and Backward Interpolations
Compute a 7D-value of the Bessel function Jo(x) for x = 1.72 from the four values in the following table, using
(a) Newton's forward formula (14). (b) Newton's backward formula (18).
jfor
jbaclr
.Ij
Jo(.\j)
0
-3
1.7
0.3979849
1
-2
l.8
0.3399864
] st Diff.
2nd Diff.
3rd Diff.
-0.0579985
-0.0001693
-0.0581678
-]
2
l.9
0.2818186
0.0004093
0.0002400
-0.0579278
3
0
Solution.
I
2.0
0.2238908
The computation of the differences is the same in both cases. Only their notation differs.
(a) Forward. In (14) we have r = (1.72 - 1.70)/0.1 = 0.2, andj goes from
each column we need the first given number, and (14) thus gives
Jo(1.7 2) = 0.3979849
=
+ 0.2( -0.0579985) +
0.3979849 - 0.011 5997
a to 3 (see first column). In
0.2( -0.8)
0.2( -0.8)( -1.8)
2
(-0.000 1693) +
6
. 0.000 4093
+ 0.000 0135 + 0.000 0196
which is exact to 6D, the exact 7D-va1ue being 0.3864185.
=
0.3864183,
808
CHAP. 19
r
Numerics in General
(b) 8ack\\ard. For (18) we usej shown in the second column. and in each column the last number. Since
= (1.72 - 2.00)/0.1 = -2.8. we thus gel from lI8)
1 0 (1.72) = 0.223 89U8 - 2.8( -0.0579278)
+
-2.8(- 1.8)
2
. 0.000 2400 +
-2.8(
=
0.223 8908 + 0.162 1978 + 0.000 6048 - 0.000 2750
=
0.386 4184.
\.8)(-
6
0.8)
. 0.000 4093
•
Central Difference Notation
This is a third notation for differences. The first central difference of f(x) at Xj is defined
by
and the kth central difference of f(x) at
<:okf -
(19)
v
j -
Xj
by
<:ok-If
v
j+1/2 -
<:ok-If
v
(j
j-1/2
= 2,3, .. ').
Thus in this notation a difference table, for example, for f -1' fo, fI, f2, looks as follows:
X-I
f
Xo
fo
-I
[)f -1/2
[)2fo
[)3f1/2
[)fI/2
Xl
fl
X2
f2
[)2fl
[)f3/2
Central differences are used in numeric differentiation (Sec. 19.5), differential equations
(Chap. 21), and centered interpolation formulas (e.g., Everett's formula in Team Project
22). These are formulas that use function values "symmetrically" located on both sides
of the interpolation point x. Such values are available near the middle of a given table,
where centered interpolation formulas tend to give better results than those of Newton's
formulas, which do not have that "symmetry" property.
1. (Linear interpolation) Calculate PI (x) in Example I.
Compute from it In 9.4 = PI(9.4).
2. Estimate the enor in Prob. 1 by (5).
3. (Quadratic interpolation) Calculate the Lagrange
polynomial 1'2(X) for the 4D-values of the Gamma
function [(24), App. 3.1J r(l.oO) = 1.0000.
r(l.02) = 0.9888, r(l.04) = 0.9784, and from it
approximations of r{x) for x = 1.005, 1.010. 1.015.
1.025. 1.030. 1.035.
4. (Error bounds) Derive enor bounds for P2(9.2) in
Example 2 from (5).
S. (Error function) Calculate the Lagrange polynomial
P2(X) for the 50-values of the enor function
j(x) = erf x = (2/\'";) J~ e- dw, namely.
1(0.25) = 0.27633 ..«0.5) = 0.52050. f( I) = 0.84270.
and fromp2 an approximation of j(O.75) (= 0.71116.
50).
w2
6. Derive an error bound in Prob. 5 from (5).
7. (Sine integral) Calculate the Lagrange polynomial
P2(X) for the 40-values of the sine integral Si(x) [(40)
in App. 3.1], namely, Si(O) = O. Si(l) = 0.9461.
Si(2) = 1.6054, and from P2 approximations of Si(O.5)
(= 0.4931. 40) and Si(1.5) (= 1.3247,401.
8. (Linear and quadratic interpolation) Find e- O.25 and
O 75
e- . by linear interpolation with xo = 0, Xl = 0.5 and
Xo = 0.5, Xl = I. respectively. Then find pix)
SEC. 19.3
809
Interpolation
interpolating e- x with Xo = O. Xl = 0.5. X2 = 1 and
from it e- O.25 and e- O.75 • Compare the errors of these
linear and quadratic interpolations. Use 4D-values of e- x .
9. (Cubic Lagrange interpolation) Calculate and sketch
or graph L o, L 1 . L 2 , L3 for x = O. 1.2.3 on common
axes. Find P3(X) fOT the data
(0. I)
(1,0.765198)
(2, 0.223891 )
(3. -0.260052)
[values of the Bessel function 10(x)]. Find
= 0.5, 1.5. 2.5 by interpolation.
P3
for
X
10. (Interpolation and extrapolation) Calculate P2(X) in
Example 2. Compute from it approximations of In 9.4,
In 10, In 10.5. In 11.5. In 12, compute the errors by
using exact 4D-values. and comment.
11. (Extrapolation) Does a sketch or graph of the product
of the (x - Xj) in (5) for the data in Prob. 10 indicate
that extrapolation is likely to involve larger errors than
interpolation does?
12. (Lower degree) Find the degree of the interpolation
polynomial for the data
(-2.33)
(0.5)
(2.9)
(4,45)
(6, 113).
13. (Newton's forward difference formula) Set up (14)
for the data in Prob. 7 and derive P2(x) from (14).
14. Set up Newton's forward difference formula for the
data in Prob. 3 and compute [(1.01), [(1.03), [(1.05).
15. (Newton's divided difference formula) Compute
f(0.8) and f(0.9) from
f(0.5) = 0.479
f(I.O) = 0.841
17. (Central differences) Write the difference in the table
in Example 5 in central difference notation.
18. (Subtabulation) Compute the Bessel function 11(X) for
X = 0.1. 0.3, .... 0.9 from 1 1(0) = 0,11(0.2) = 0.09950.
h(O.4) = 0.19603. 1 1 (0.6} = 0.28670.11(0.8) = 0.36884,
11 (1.0) = 0.44005. Use (14) with II = 5.
19. (Notations) Compute a difference table of f(x) = x3
for X = O. 1, 2. 3.4. 5. Choose Xo = 2 and write all
occurring numbers in tenTIS of the notations (a) for
central differences, (b) for forward differences, (c) for
backward differences.
20. CAS EXPERIMENT. Adding Terms in Newton
Formulas. Write a program for the forward formula
( 14). Experiment on the increase of accuracy by
successively adding terms. As data use values of some
function of your choice for which your CAS gives the
values needed in determining errors.
21. WRITING PROJECT. Interpolation: Comparison
of Methods. Make a list of 5-6 ideas that you feel are
most basic in this section. Arrange them in the best
logical order. Discuss them in a 2-3 page report.
22. TEAM PROJECT. Interpolation and Extrapolation. (a) Lagrange practical error estimate (after
Theorem I). Apply this to PI (9.2) and P2(9.2) for the
data Xo = 9.0. Xl = 9.5. X2 = 11.0. fo = In Xo.
fl = In XI> f2 = In X2 (6S-values).
(b) Extrapolation. Given (.~i' f(x) = (0.2.0.9980).
(0.4. 0.9686). (0.6. 0.8443), (0.8, 0.5358), (1.0. 0).
Find f(0.7) from the quadratic interpolation
polynomials based on (a) 0.6, 0.8. 1.0, ({3) 0.4. 0.6.
0.8. (y) 0.2, 0.4, 0.6. Compare the errors and comment.
[Exact f(x) = cos (!7TX 2 ), f(O.7) = 0.7181 (45).1
(c) Graph the product of factors (x - xi> in the error
formula (5) for 11 = 2.' . '. 10 separately. What do
these graphs show regarding accuracy of interpolation
and extrapolation?
(d) Central differences. Show that
8 2 fm = fm+l - 2fm + fm-I> and. furthermore
8 3 fm+I12 = fm+2 - 3fYII f 1 + 3fm - f m - 1 •
n
8 fm = t!.."fm-n/2 = vnfm+n/2'
(e) Everett's interpolation formula
f(2.0) = 0.909
f(x)
=
(I - r)fo
by quadratic interpolation.
(20)
16. Compute f(6.5) from
+
(2 -
+
rf1
r)O -
3!
r)( -r)
2
8 fo
f(6.0) = O. J 506
f(7.0) = 0.3001
f(7.5)
= 0.2663
fO.7) = 0.2346
by cubic interpolation, using (10).
is an example of a formula involving only even-order
differences. Use it (0 compute the Bessel function Jo(x)
for x = 1.72 from 10(l.60} = 0.4554022 and 10 (1.7).
10(1.8),10(1.9) III Example 6.
810
19.4
CHAP. 19
Numerics in General
Spline Interpolation
Given data (function values, points in the .\y-plane) (.\"0' f 0), (XI- f 1)' . . . , (Xn , f n) can be
interpolated by a polynomial P,/x) of degree 11 or less so that the curve of P,ix) passes
through these 11 + 1 points (Xj, fj); here fo = f(xo), ...• fn = f(x,.). See Sec. 19.3.
Now if 11 is large, there may be trouble: P n(x) may tend to oscillate for x between the
nodes xo, ... , x",. Hence we must be prepared for numeric instability (Sec. 19.1). Figure
431 shows a famous example by C. Runge 3 for which the maximum error even approaches
x as 11 ~ x (with the nodes kept equidistant and their number increased). Figure 432
illustrates the increase of the oscillation with 11 for some other function that is piecewise
linear.
Those undesirable oscillations are avoided by the method of splines initiated by
I. J. Schoenberg in 1946 (Quarterly of Applied Mathematics 4, pp. 45-99, 112-141). This
method is widely used in practice. It also laid the foundation for much of modem CAD
(computer-aided design). its name is borrowed from a draftman's spline, which is an
elastic rod bent to pass through given points and held in place by weights. The
mathematical idea of the method is as follows:
Instead of using a single high-degree polynomial Pn over the entire interval a ~ x ~ b
in which the nodes lie, that is,
a
(1)
we use
11
=
Xo
< Xl < ... < Xn = b,
low-degree, e.g., cubic, polynomials
one over each subinterval between adjacent nodes. hence qo from Xo to Xl' then q1 from
YI
-5
"'--/
Fig. 431.
Fig. 432.
3
-~.
o
Runge's example {(x) = 1/(1
+ x 2 ) and
"'-../
0
5
x
interpolating polynomial PlQ(x)
Piecewise linear function {(x) and interpolation polynomials of increasing degrees
CARL RUNGE (1856-1927). German mathematician, also known for his work on ODEs (Sec. 21.1).
SEC. 19.4
811
Spline Interpolation
to X2' and so on. From this we compose an interpolation function g(x), called a spline.
by fitting these polynomials together into a single continuous curve passing through the
data points, that is.
Xl
Note that g(x) = qo(.\") when Xo ~ x ~ Xl' then g(x) = q1(X) when Xl ~ X ~ X2, and so
on, according to our construction of g.
Thus spline interpolation is piecewise polY1lomial intel]Jolatioll.
The simplest q/ s would be linear polynomials. However. the curve of a piecewise linear
continuous function has corners and would be of little interest in general-think of
designing the body of a car or a ship.
We shall consider cubic splines because these are the most important ones in
applications. By definition, a cubic spline g(x) interpolating given data (xo, f 0), ... ,
(xn , fn) is a continuous function on the interval a = Xo ~ X ~ X." = b that has continuous
first and second derivatives and satisfies the interpolation condition (2); furthermore,
between adjacent nodes, g(x) is given by a polynomial %(x) of degree 3 or less.
We claim that there is such a cubic spline. And if in addition to (2) we also require that
(3)
(given tangent directions of g(x) at the two endpoints of the interval a ~ X ~ b), then we
have a uniquely detennined cubic spline. This is the content of the following existence
and uniqueness theorem. whose proof will also suggest the actual determination of splines.
(Condition (3) will be discussed after the proof.)
THEOREM 1
Existence and Uniqueness of Cubic Splines
Let (xo, f 0), (xt> f1)' .•• , (xn , f.,,) with arbitrarily spaced givell Xj [see (I)] and given
fj = f(x),.i = 0, I, ... , Il. Let ko alld k n be allY given numbers. Then there is olle
and only one cubic spline g(x) corresponding to (1) alld satisfying (2) and (3).
PROOF
By definition, on every subintervallj given by Xj ~ x ~ Xj+ 1 the spline g(x) must agree
with a polynomial %lX) of degree not exceeding 3 such that
(4)
(j
= 0,
(j
= 0, 1, ... , n - I)
I, ... ,
17 -
I).
For the derivatives we write
(5)
with ko and kn given and k1, ••• , kn - 1 to be detemllned later. Equations (4) and (5) are
four conditions for each qj(x), By direct calculation, using the notation
(6*)
1
(j
= 0, 1, ... , Il
-
1)
we can verify that the unique cubic polynomial %(x) (j = 0, 1, ... , n - I) satisfying
812
CHAP. 19
Numerics in General
(4) and (5) is
%(X) = f(xj)c/(x - Xj+l)2[1
(6)
+
2clx - x)]
+
f(Xj+l)C/(X - X)2[1 - 2cj(x - Xj+l)]
+
kjc/(x - Xj)(x - Xj+l)2
+
kj + 1C/(X - X)2(X - Xj+l)'
Differentiating twice, we obtain
(8)
By definition, g(x) has continuous second derivatives. This gives the conditions
(j = I,' .. ,
11 -
I).
If we use (8) withj replaced by j - I, and (7), these n - I equations become
where ViJ = f(xj) - f(Xj-l) and Vfj+l = f(Xj+l) - f(xj) and j = 1,' .. , n - I, as
before. This linear system of 11 - I equations has a unique solution k1 , ••• , kn - 1 since
the coefficient matrix is strictly diagonally dominant (that is. in each row the (positive)
diagonal entry is greater than the sum ofthe other (positive) entries). Hence the determinant
of the matrix cannot be zero (as follows from Theorem 3 in Sec. 20.7), so that we may
determine unique values k] • ... , kn - 1 of the first derivative of g(x) at the nodes. This
proves the theorem.
•
Storage and Time Demands in solving (9) are modest. since the matrix of (9) is sparse
(has few non7ero entries) and tridiagonal (may have nonzero entries only on the diagonal
and on the two adjacent "parallels" above and below it). Pivoting (Sec. 7.3) is not necessary
because of that dominance. This makes splines efficient in solving large problems with
thousands of nodes or more. For some literature and some critical comments, see American
Mathematical Monthly 105 (1998), 929-941.
Condition (3) includes the clamped conditions
(10)
in which the tangent directions t' (xo) and t' (x,,) at the ends are given. Other conditions
of practical interest are the free or natural conditions
(11)
(geometJically: zero curvature at the ends, as for the draftman's spline), giving a natural
spline. These names are motivated by Fig. 290 in Problem Set 12.3.
SEC. 19.4
813
Spline Interpolation
Determination of Splines. Let ko and kn be given. Obtain kl' .... k n - l by solving the
linear system (9). Recall that the spline g(x) to be found consists of n cubic polynomials
qo, ... , qn-l' We write these polynomials in the form
(12)
where j = O•...•
11 -
I. Using Taylor's formula. we obtain
ajO = q/x) = Ij
by (2),
q; (Xj) = kj
by (5),
ajl
=
by (7),
(13)
with aj3 obtained by calculating q;'(Xj+l) from (12) and equating the result to (8), that is,
and now subtracting from this 2aJ 2 as given in (13) and simplifying.
Note that for equidistant nodes of distance hj = h we can write cJ =
and have from (9) simply
= 1111 in (6*)
(j = 1, ... ,
(14)
E X AMP L E 1
C
11 -
I).
Spline Interpolation. Equidistant Nodes
Interpolate f(x) = .,.4 on the interval -I :::;:: x :::;:: I by the cubic spline g( r) corresponding to the nodes
Xo = -1. Xl = O. X2 = 1 and satisfying the clamped condition, g' (-1) = f' (- I), g '(1) = f' (I).
Solution.
We have h
In our ,tandard notation the given data are fo = f( - \) = 1. II = frO) = O.
= 1 and 11 = 2, so that our spline consists of 11 = 2 polynomials
qo(x)
= 1I00
ql(X) = 1I1O
+
lIOl("
+
+ 1Il1-~ +
1)
+
1I12X2
lI02(x
+
1)2
+
lI03(x
+
1)3
+ {/13..3
f2
=
f( 1)
(-\ :::;:: x:::;:: 0).
(0 :::;:: x:::;:: I).
We determine the kj from (14) (equidistance!) and then the coefficients of the spline from (13). Since
the sp,tem (14) is a ,ingle equation (with j = 1 and h = I)
ko
Here fo = f2 = \ (the value of x4 at the ends) and ko
- I and 1. Hence
+
4kl
11
= 2,
+ 4kl + k2 = 3(f2 - fo)·
= -4.
k2
= 4. the value~ of the derivative 4\·3 at the
end~
-4
= 1.
+
4 = 3(1 -
I) = 0,
From (13) we can now obtain the coefficients of qo, namely, {loo
=
fo
= 1,
{lOI
= ko = -4. and
814
CHAP. 19
Numerics in General
3
a02
= ]2 (h - Io)
a03
=
2
3(Io I
1
-1 (k1 +
h) +
2ko)
I
"2(kl + ko)
I
=
3(0 - 1) - (0 - 8)
= 2(1 -
Similarly, for the coefficients of qi we obtain from (13) the values
a12
=
3(I2 -
h) -
a13 = 2(h - I2)
(k2
0)
+
(0 -
0lO
=
II
4)
=
0,
=
5
= -2.
all
=
kl
=
0, and
+ 2k1 ) = 3(1 - 0) - (4 + 0) = -1
= 2(0
+ (k2 + k 1)
-
I) + (4 + 0)
= 2.
This gives the polynomials of which the spline g(x) consists. namely,
if
if
-I
~x~O
O~x~l.
Figure 433 shows .f(x) and this spline. Do you see that we could have saved over half of our work by using
symmetry?
•
((x)
x
Fig. 433.
E X AMP L E 2
Function fIx) =
X4
and cubic spline g(x) in Example 1
Natural Spline. Arbitrarily Spaced Nodes
Find a spline approximation and a polynomial approximation for the curve of the cross section of the circularshaped Shrine of the Book in Jerusalem shown in Fig. 434.
'"
~
•
•
;; I
T:
Jl
~
-3
Fig. 434.
-2
0
Shrine of the Book in Jerusalem (Architects F. Kissler and A. M. Bartus)
SEC. 19.4
Spline Interpolation
815
Solution.
Thirteen points. about equally distributed along the contour (nut along the x-axis!), give these data:
Xj
5.8
Ij
0
-5.0
-4.0
-2.5
-1.5
-0.8
1.5
1.8
2.2
2.7
3.5
0
0.8
1.5
2.5
4.0
5.0
5.8
3.9
3.5
2.7
2.2
1.8
1.5
0
The figure shows the conesponding interpolation polynomial of 12th degree, which is useless because of its
oscillation. (Because of roundoff your software will also give you small enor terms involving odd powers of
x.) The polynomial is
P12(X) = 3.9000 - 0.65083..2
+ 0.033858x4 + 0.01l04lx 6
+ 0.000055595x 10
-
-
0.00I40I0x 8
0.00000071 867x 12 .
The spline follows practically the contour of the roof, with a small error near the nodes -O.li and 0.8. The spline
is symmetric. Its six polynomials corresponding to positive x have the fullowing coefficieuts of their
represeutations (12). (Note well that (12) is in terms of powers of x - Xj, uot x!)
I
I
j
x-interval
ajO
ajl
aj2
aj3
0
0.0... 0.8
0.8. .. 1.5
[.5 .. .2.5
2.5 .. .4.0
4.0 ... 5.0
5.0 ... 5.8
3.9
3.5
2.7
2.2
1.8
1.5
0.00
-1.01
-0.95
-0.32
-0.027
-1.13
-0.61
-0.65
0.73
-0.09[
0.29
-1.39
-0.015
0.66
-0.27
0.084
-0.56
0.58
1
2
3
4
5
- --
....._...._-.
..... _........... -.
~
1. WRITING PROJECT. Splines. In your own words,
and using as few formulas as possible, write a short
report on spline interpolation, its motivation, a
comparison with polynomial interpolation, and its
applications.
2. (Individual polynomial qj) Show that qj(x) in (6)
satisfies the interpolation condition (4) as well as the
derivative condition (5).
3. Verify the differentiations that give (7) and (8) from
(6).
4. (System for derivatives) Derive the basic linear
system (9) for k1 , . . . , k n - 1 as indicated in the text.
5. (Equidistant nodes) Derive (14) from (9).
6. (Coefficients) Give the details of the derivation of aj2
and aj3 in (13).
7. Verify the computations in Example I.
8. (Comparison) Compare the spline g in Example I with
the quadratic interpolation polynomial over the whole
interval. Find the maximum deviations of g and P2 from
f. Comment.
9. (Natural spline condition) Using the given
coefficients, verify that the spline in Example 2
satisfies g"(x) = 0 at the ends.
110-161
DETERMINATION OF SPLINES
Find the cubic spline g(x) for the given data with ko and k n
as given.
10. f( -2) = .f( -1)
ko = k4 = 0
=
f(1)
=
f(2)
=
O. f(O)
= I.
11. If we started from the piecewise linear function in
Fig. 435. we would obtain g(x) in Prob. 10 as the spline
satisfying g' (-2) = f' (-2) = 0, g' (2) = f' (2) = O.
Find and sketch or graph the corresponding
interpolation polynomial of 4th degree and compare it
with the spline. Comment.
I
~2_-&-----=--2
-----/
0
----
x
Fig. 435. Spline and interpolation
polynomial in Problems 10 and 11
12. fo = f(O)
h = f(6)
1, fl = f(2) = 9, f2 = f(4)
41, ko = 0, k3 = -12
= 41,
816
CHAP. 19
Numerics in General
13. fo = f(-I) = O. fl = f(O) = 4. f2 = f(1) = O.
ko = 0, k2 = O. Is g(x) even? (Give reason.)
14. fo = f(O) = o . .ft = fO) = L f2 = f(Z) = 6.
f3 = f(3) = 10. ko = 0, k3 = 0
15. fo = f(O) = \. fl = fO) = O. f2 = feZ) = - \ .
f3 = f(3) = 0.1..0 = O. k3 = -6
16. It can happen that a spline is given by the same
polynomial in two adjacent subintervals. To illustrate
this, find the cubic spline g(x) for f(x) = sin x
corresponding to the partition Xo = -7TI1. Xl = 0,
X2 = 7T/2 of the interval -7T/Z ~ x ~ 7T/Z and
satisfying g'(-7T/Z) = f'(-7T/2) and
in components.
\"(t) = Xo
+
y(t) = Yo
+
+
X~f
+
(2(xo -
+
Y~f
(2(yo -
(3(Xl -
Xl)
+
+
X~
(3(Yl YI)
xo) -
(2x~
+
3
X;)t
l'O) -
+ y~ +
-t- X;»f2
(2y~ + y;»t 2
y;)£3.
Note that this is a cubic Hennite interpolation
polynomiaL and 11 = I because we have two nodes (the
endpoints of C). (This has nothing to do with the
Hermite polynomials in Sec. 5.S.) The two points
g'(7T/Z) = f'(7T/2).
17. (Natural conditions) Explain the remark after (II).
18. CAS EXPERIMENT. Spline versus Polynomial. [f
your CAS gives natural splines, find the natural splines
when x is integer from -111 to Ill, and yeO) = I and all
other y equal to O. Graph each such spline along with
the interpolation polynomial P2m' Do this for 111 = Z to
10 (or more). What happens with increasing 111?
19. If a cubic spline is three times continuously differentiable
(that is, it has continuous first, second. and third
derivatives). show that it must be a single polynomial.
20. TEAM PROJECT. Hermite Interpolation and
Bezier Curves. In Hermite interpolation we are
looking for a polynomial p(x) (of degree ZI1 + I or less)
such that p(x) and its derivative p' (x) have given values
at 11 + I nodes. (More generally, p(x). p' (x), p"(xJ, ...
may be required to have given values at the nodes.)
(a) Curves with given endpoints and tangents. Let
C be a curve in the x),-plane parametrically represented
by ret) = [x(t), y(t)], 0 ~ t ~ I (see Sec. 9.5). Show
that for given initial and terminal points of a curve and
given initial and terminal tangents. say,
A:
ro
=
[x(O).
yeo)]
[xo, Yo].
[Xl'
Vo
=
x~. Yo + Y~]
= [Xl -
x;, YI
-
Y;]
are called guidepoints because the segments AGA and
BGB specify the tangents graphically. A, 8, GA , GB
determine C. and C can be changed quickly by moving
the points. A curve consisting of such Hemlite
interpolation polynomials is called a Bezier curve.
after the French engineer P. Bezier of the Renault
Automobile Company. who introduced them in the
early 1960s in designing car bodies. Bezier curves (and
surfaces) are used in computer-aided design (CAD) and
computer-aided manufacturing (CAM). (For more
detaib, ~t!e Ref. [E21] in App. \.)
(b) Find
and graph the Bezier curve and its
guidepoints if A: [0, 0]. 8: [I, 0], Vo = [i,
VI = [-i, -!vi3].
n
(c) Changing guidepoints changes C. Moving
guide points farther away makes C "staying near the
tangents for a longer time." Confirm this by changing
Vo and VI in (b) to Zvo and 2Vl (see Fig. 436).
(d) Make experiments of your own. What happens if
you change VI in (b) to -VI' If you rotate the tangents?
If you multiply Vo and VI by positive factors less
than I?
[x(l). y(J)]
r1
8:
= [xo +
and
yd
[x'(O), ),'(0)]
[x~, y~].
VI
[x'(I),
[x~.
y'(I)]
y;]
we can find a curve C, namely,
reT)
(15)
=
ro + vot
+
+
(3(r 1
ro) - (ZVo
(Z(r o - r 1 )
+
Vo
+
+
vtl)t 2
V 1 )t 3 ;
B
Fig. 436.
x
Team Project 20(b) and (c): Bezier curves
SEC. 19.5
19.5
817
Numeric Integration and Differentiation
Numeric Integration and Differentiation
Numeric integration mean<; the numeric evaluation of integrals
J =
I
b
f(x) dx
a
where a and b are given and f is a function given analytically by a formula or empirically by
a table of values. Geometrically, J is the area under the curve of f between a and b (Fig. 437).
We know that if f is such that we can find a differentiable function F whose derivative
is f. then we can evaluate J by applying the familiar formula
J =
I
b
f(x) dx
=
[F' (x) =
F(h) - F(a)
f(x)j.
a
Tables of integrals or a CAS (Mathematica. Maple, etc.) may be helpful for this purpose.
However, applications often lead to integrals whose analytic evaluation would be very
difficult or even impossible, or whose integrand is an empirical function given by recorded
numeric values. Then we may obtain approximate numeric values of the integral by a
numeric integration method.
Rectangular Rule. Trapezoidal Rule
Numeric integration methods are obtained by approximating the integrand f by functions
that can easily be integrated.
The simplest formula. the rectangular rule. is obtained if we subdivide the interval of
integration a;:::; x;:::; b into /l subintervals of equal length II = (b - a)//l and in each subinterval
approximate f by the constant f(x/), the value of f at the midpoint x/ of the jth subinterval
(Fig. 438). Then f is approximated by a step function (piecewise constant function). the 11
rectangles in Fig. 438 have the areas f('\·I*)I1 • ... , f(xn*)I1, and the rectangular rule is
J =
(1)
I
b
f(x)dx = h[f(Xl*)
+
f(X2*)
+ ... + f(xn*)]
a
( b-ll)
/7=--n
.
The trapezoidal rule is generally more accurate. We obtain it if we take the same
subdivision as before and approximate f by a broken line of segments (chords) with
endpoints [a, f(a)], [Xl> f(Xl)], ... , [b, feb)] on the curve of f (Fig. 439). Then the area
under the curve of f between a and b is approximated by n trapezoids of areas
Mf(a) + f(Xl)]h,
![f(Xl)
+ f(X2)]h,
![f(Xn-l)
y
y
y={(x)
~
R
a
b
X
Fig. 437. Geometric interpretation
of a definite integral
+
f(b)]h.
r/oK
11 '" )
I
I
a xt
X·
2
Fig. 438.
X
I
* b
n
Rectangular rule
x
CHAP. 19
818
Numerics in General
y
!~~ )
'I
o(
00
x
Trapezoidal rule
Fig. 439.
By taking their slim we obtain the trapezoidal rule
(2)
]
I
=
b
lex) dx = h[!l(a)
+
f(X1)
+
f(X2)
+ ... +
f(Xn-l)
+
!f(b)]
a
=
where h
E X AMP L E 1
(b - a)/n, as in (1). The
x/s and a and b are called nodes.
Trapezoidal Rule
=
Evaluate J
f
1
e -:? dx by means of (2) with
11
= 10.
o
Solution.
Table 19.3
j
0
2
3
4
5
6
7
8
9
10
J = 0.1(0.5·1.367879
+ 6.778167)
=
0.746211 from Table 19.3.
•
Computations in Example 1
Xj
X/
0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
0
0.01
0.04
0.09
0.16
0.25
0.36
0.49
0.64
0.81
1.00
Sums
1.000000
0.367879
1.367879
Error Bounds and Estimate for the Trapezoidal Rule
An error estimate for the trapezoidal rule can be derived from (5) in Sec. 19.3 with
n = 1 by integration as follows. For a single subinterval we have
f(x) - P1(X)
=
f'(t)
(x - Xo)(X - Xl) - -
2
SEC 19.5
819
Numeric Integration and Differentiation
with a suitable t depending on x, between Xo and Xl. Integration over X from a
Xl = Xo + h gives
J
Iz
xo+h
~
f(x) dx - -
2
J
+ f(XI)] =
(y - xo)(x - Xo - II)
~
Xo to
f"(t(x»
xo+h
[fCyo)
=
2
dx.
Setting X - Xo = v and applying the mean value theorem of integral calculus, which we
can use because (x - xo)(x - Xo - h) does not change sign, we find that the right side
equals
{'Ci)
v(v - III dv - o
2
f
(3*)
h
=
3
( h
3
-
3
3
h ) f"Ci)
h
-- = - 2
2
12
-
_
f " (t)
where t is a (suitable, unknown) value between Xo and Xl. This is the error for the
trapezoidal rule with n = 1, often called the local error.
Hence the error E of (2) with any n is the sum of such contributions from the
n subintervals; since /z = (b - a)/n. nh 3 = neb - a)3/1l 3• and (b - a)2 = n 2/z 2• we obtain
(3)
E=-
(b - a)3 II ~
(b - a) 2 II ~
12n 2
fU)=12
hf(t)
with (suitable, unknown) i between a and b.
Because of (3) the trapezoidal rule (2) is also written
(2*)
1 =
b
[1
1
J f(x) dx = /z "2f(a) + f(XI) + ... + f(Xn-I) + "2f(b)
a
]- b - a 2" ~
- h f (t).
12
Error Bounds are now obtained by taking the largest value for f", say, M 2 , and the
smallest value, M 2 *. in the interval of integration. Then (3) gives (note that K is negative)
(4)
where
(b - a)3
K = - ---=-2
12n
Error Estimation by Halving h is advisable if h" is very complicated or unknown,
for instance, in the case of experimental data. Then we may apply the Error Principle
of Sec. 19.1. That is, we calculate by (2), first with h. obtaining, say, 1 = 1h + Eh, and
then with ~/z, obtaining 1 = h/2 + Eh/2. Now if we replace /z2 in (3) with (~h)2, the error
is multiplied by 1/4. Hence Eh/2 = iEh (not exactly because i may differ). Together,
1h/2 + Eh/2 = 1h + Eh = 1h + 4Eh/2. Thus 1h/2 - 1h = (4 - l)Eh/2. Division by 3
gives the error formula for 1h/2
(5)
E X AMP L E 2
Error Estimation for the Trapezoidal Rule by (4) and (5)
Estimate the error of the approximate value in Example I by (4) and (5).
Solution.
o
(A) Error boullds by (4). By differentiation, f"(x) = 2(2x 2
-
l)e- x2 . Also, {"(x) > 0 if
< x < 1, so that the minimum and maximum occur at the ends of the interval. We compute
820
CHAP. 19
M2
=
Numerics in General
f"(I)
= 0.735759 and
M2*
= ('(0) = -2. Furthenl1ore.
-0.000 614
~ E ~
K
= -1/1200. and (4) gives
0.001 (,67.
Hence the exact value of 1 must lie between
0.74(,211 - 0.000 fil4 = 0.745597
0.74(, 21 1
and
+ 0.001
6(,7 = 0.747 87R.
Actually, 1 = 0.746 824, exact 10 60,
(8) Error estimate by (5). lh = 0,746211 in Example I. Also.
lh/2 =
0,05
[~ e-cil2ol' + + + 0.367R79)]
(1
0,74(,671.
=
J~l
Hence
Eh/2
=
l(Jh/2 -
1,,)
= 0,000153 and lh/2 +
E"/2
•
= 0,746824. exact (0 60,
Simpson's Rule of Integration
Piecewise constant approximation of f led to the rectangular rule (I), piecewise linear
approximation to the trapezoidal rule (2), and piecewise quadratic approximation will lead
to Simpson's rule. which is of great practical importance because it is sufficiently accurate
for most problems, but still sufficiently simple.
To derive Simpson's rule, we divide the interval of integration a 2 x 2 b into an even
Ilumber of equal subintervals. say, into 11 = 2m subintervals of length h = (b - a)/(2m),
with endpoints Xo (= a), Xl, ... , X2m-1' X2m (= b); see Fig. 440. We now take the first
two subintervals and approximate f(x) in the interval Xo 2 X 2 X2 = Xo + 211 by the
Lagrange polynomial P2(X) through (xo, fo). (XIo f1), (X2, f2), where f j = f(xj). From (3)
in Sec. 19.3 we obtain
The denominators in (6) are 2h2, -112, and 2172, respectively. Setting s
have
X -
Xl = sl1,
Xo = x - (Xl - h) = (s
X -
X - X2 = X - (Xl
+ 11)
=
+ l}h
1)17
= (s -
and we obtain
P2(X) = !s(s - l)fo - (s
y
+ 1)(S - l)f1 + !(S + l)sf2'
r; -...:.:
-J
First parabola
_ ,./'' Second parabola
rl~
rI
~ Lo'T~"
..
I~Y,
I
I
I
x
Fig. 440.
Simpson's rule
(x - x1)111, we
SEC. 19.5
821
Numeric Integration and Differentiation
We now integrate with respect to x from Xo to X2' This corresponds to integrating with
respect to s from -1 to 1. Since dx = h ds, the result is
J
X2
(7*)
=
f(x) dx
Xv
J""2P2(X) dx = h ( -31fo + -34f1 + "3I)
f2 .
Xv
A similar formula holds for the next two subintervals from X2 to -'"4, and so on. By summing
all these 111 formulas we obtain Simpson's rule4
(7)
I
h
b
f(x) dx = -
3
a
+ 4f1 +
(fo
where h = (b - a)/(211l) and fj
rule.
Table 19.4
2f2
+
4f3
+ ... + 2f2m-2 +
4f2m-1
+
f2m),
f(xj)' Table 19.4 shows an algorithm for Simpson's
=
Simpson's Rule of Integration
ALGORITHM SIMPSON (a, b,
111,
fo, f1, ... , f2m)
This algorithm computes the integral j = Jgf(x) dx from given values fj = f(xj) at
equidistant Xo = {/, Xl = Xo + h . .... X2m = Xo + 2mh = b by Simpson's rule (7).
where h = (b - a)/(2m).
INPUT:
a, b,
fo • ... , f2m
Approximate value J of j
OUTPUT:
Compute
111.
fo
+
f2m
= f2
+
f4
So =
S2
+ ... +
f2m-2
h = (b - a)/2111
_
j
h
=
-
3
(so
+ 4s1 +
2s2 )
OUTPUT J Stop.
End SIMPSON
Error of Simpson's Rule (7). If the fourth derivative
a ::::; x ::::; b, the error of (7), call it ES' is
(8)
ES
=-
(b - a)5
180(2171)4
(4) At
_
f ()-
t<4) exists and is continuous on
_(b_-_G_J h 4f(4)(i)'
180
'
~HOMAS SIMPSON (1710-1761), self-taught English mathematician. author of several popular textbooks.
Simpson's rule was used much earlier by Torricelli, Gregory (in 1668), and Newton (in 1676).
822
CHAP. 19
Numerics in General
here t is a suitable unknown value between a and b. This is obtained similarly to (3). With
this we may also write Simpson's rule (7) as
(7**)
Error Bounds. By taking for t<4) ill (8) the maximum M4 and minimum M4 * on the
interval of integration we obtain from (8) the error bounds (note that C is negative)
C
where
(9)
(b - a)5
= - ----:180(2m)4
Degree of Precision (DP) of an integration f0I111ula. This is the maximum degree of
arbitrary polynomials for which the formula gives exact values of integrals over any
intervals.
Hence for the trapezoidal rule,
DP
=
I
because we approximate the curve of f by portions of straight lines (linear polynomials).
For Simpson's rule we might expect DP = 2 (why?). Actually,
DP = 3
by (9) because /4) is identically zero for a cubic polynomiaL This makes Simpson's nile
sufficiently accurate for most practical problems and accounts for its popUlarity.
Numeric Stability with respect to rounding is another impOltant property of Simpson'~
nile. Indeed, for the sum of the roundoff errors Ej of the 2m + I values f j in (7) we obtain.
since Iz = (b - a)12111,
h
-3 lEo
+
4El
+ ... +
E21111 ~
(b - a)
3·2m
6111u
= (b - a)u
where u is the rounding unit (u = ~. 10- 6 if we round off to 6D; see Sec. 19.1). Also
6 = I + 4 + I is the sum of the coefficients for a pair of intervals in (7); take 111 = I in
(7) to see this. The bound (b - (I) u is independent of Ill, so that it cannot increase with
increasing 11l, that is, with decreasing h. This proves stability.
•
Newton-Cotes Formulas. We mention that the trapezoidal and Simpson rules are
special closed Newton-Cafes formulas, that is, integration formulas in which f(x) is
interpolated at equalIy spaced nodes by a polynomial of degree n (n = I for trapezoidal,
Il = 2 for Simpson), and closed means that a and b are nodes (a = .ro, b = xn). Il = 3
(the three-eighths nile; Review Prob. 33) and a higher n are used occasionally. From
n = 8 on, some of the coefficients become negative, so that a positive f· could make a
~egative contribution to an integral, which is absurd. For more on this topi~ see Ref. [E25]
m App. I.
SEC 19.5
823
Numeric Integration and Differentiation
E X AMP L E 3
Simpson's Rule. Error Estimate
Evalume J =
f
Solution.
Since h = 0.1, Table 19.5 give,
1
e -:i'- dx by Simpson's rule with 2m = 10 and estimate the error.
o
0.1
3
J =
(1.367879 + 4· 3.740 266 + 2· 3.037 901) = 0.746825.
Estimate of error. Differentiation gives ( 4 )(x) = 4(4..4 - 121.2 + 3)e-:i'-. By considering the derivative /5)
of /4) we find that the largest value of /4) in the interval of integration occurs at 0 and the smallest value at
r'" = (2.5 - 0.5"\' loi J2 . Computation gives the values M4 = /4)(0) = 12 and 1114* = /4)(x*) = -7.419.
Since 2m = 10 and b - a = I, we obtain C = - 111 800 000 = -0.000 000 56. Therefore. from (9).
-0.000 007
~ ES ~
0.000 005.
Hence J must lie between 0.746825 - 0.000 007 = 0.746818 and 0.746825 + 0.000005 = 0.746830,
so that at least four digits of our approximate value are exact. Actually. the value 0.746825 is exact to 5D because
J = 0.746824 (exact to 6D).
Thus our result is much better than that in Example 1 obtained by the trapeLOidal rule. whereas the number
of operations is nearly the same in both cases.
•
Table 19.5
j
0
Computations in Example 3
Xj
xl
0
0
0.1
0.01
2
0.2
0.04
3
4
5
6
7
8
9
[0
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
0.09
0.16
0.25
0.36
0.49
0.64
0.81
1.00
e- Xj
1.000000
0.990050
0.960789
0.913931
0.852144
0.778801
0.697676
0.612626
0.527 292
0.444 858
0.367879
1.367879
Sums
2
3.74U 266
3.037 YUl
Instead of picking an n = 2m and then estimating the error by (9), as in Example 3, it is
better to require an accuracy (e.g., 6D) and then determine 11 = 2111 from (9).
EX AMP L E 4
Determination of n
What
II
=
2m in Simpson's Rule from the Required Accuracy
should we choose in Example 3 to get 6D-accuracy?
Solution.
Using M4 = 12 (which i, bigger in absolute value than M 4 *). we get from (9), with b and the required accuracy.
thus
2'106'12
m = [
180·2
4
J1I4
a=
1
= 9.55.
Hence we should choose 11 = 2111 = 20. Do the computation, which paralleb that in Example 3.
Note that the error bounds in (4) or (9) may sometimes be loose, so that in such a case a smaller 11 = 2m
may already suffice.
•
824
CHAP. 19
Numerics in General
Error Estimation for Simpson's Rule by Halving h.
and gives
The idea is the same as in (5)
(10)
lit is obtained by using hand lh/2 by using ~h. and Eh/2 is the error of l h12 .
Derivati()n. In (5) we had ~ as the reciprocal of 3 = 4 - I and ~ = (~)2 resulted from
1z2 in (3) by replacing II with ~h. In (0) we have l~ as the reciprocal of 15 = 16 - 1 and
{6 = (~)4 results from /1 4 in (8) by replacing h with ~II.
E X AMP L E 5
Error Estimation for Simpson's Rule by Halving
Integrate flX) = l'1TX4 coslm- from 0 to 2 with" = I and apply (10).
Solution.
The exact 5D-value of the integral is 1 = 1.25953. Simp,on's rule gives
h = Hf(O} +
4f(l) + f(2)] =
i [.f(0)
11</2 =
I
= "6 LO +
+
l(O + 4· 0.555360 + 0) = 0.740480.
4.f( +) + 2.f(l) + 4J (%) + .f(2)]
4' 0.045351 + 2 ·0.555361 + 4· 1.521579 + 0]
=
1.21974.
Hence (10) gives e1>/2 = if;(1.21974 - 0.74048) = 0.032617 and thus 1 = 1h/2 + E1</2 = 1.26236. with an
error -0.00283. which is less in absolute value than
of the error 0.02979 of 1,,/2' Hence the use of (10) was
well worthwhile.
•
to
Adaptive Integration
The idea is to adapt step h to the variability of f(x). That is, where f varies but little, we
can proceed in large steps without causing a substantial error in the integraL but where .f
varies rapidly, we have to take small steps in order to stay everywhere close enough to
the curve of f.
Changing h is done systematically, usually by halving 11, and automatically (not "by
hand"') depending on the size of the (estimated) error over a subinterval. The subinterval
is halved if the cOlTesponding error is still too large, that is, larger than a given tolerance
TOL (maximum admissible absolute en-or), or is not halved if the error is less than or
equal to TOL.
Adapting is one of the techniques typical of modern software. In connection with
integration it can be applied to various methods. We explain it here for Simpson's rule.
In Table 19.6 a star means that for that subinterval, TOL has been reached.
E X AMP L E 6
Adaptive Integration with Simpson's Rule
4
Integrate J(x) = l'1Tv cos l'1TX from x = 0 to 2 by adaptive integration and with Simpson's rule and
TOLfO. 2J = 0.0002.
Solutioll.
Table 19.6 shows the calculations. Figure 441 show, the integrand ffx) and the adapted intervals
used. The first two intervals ([0. 0.5], [0.5, 1.0j) have length 0.5. hence h = 0.25 [because we use 2111 = 2
subintervals in Simpson's rule (7**)1· The next two interval~ ([1.00. 1.25J. [1.25. 1.50]) have length 0.25 (hence
h = 0.125) and the la~t four intervals have length 0.125. Sample computatio"s. For 0.7-10480 ~ee Example 5.
Formula (10) gives (0.123716 - 0.122794)115 = 0.000061. Note that 0.123716 refers to [0. 0.5] and [0.5. 11.
so that we must subtract the value conesponding to [0, I] in the line before. Etc. TOL[O, 2] = 0.0002 gives
SEC. 19.5
825
Numeric Integration and Differentiation
0.0001 for subintervals oflength 1,0.00005 for length 0.5. etc. The value of the integral obtained is the sum of
the values marked by an asterisk (for which the error estimate has become less than TaL). This gives
J = 0.123716
+
0.52~8l)5
+ 0.388263 + 0.218483
=
1.25936.
The exact 5D-value is J = 1.25953. Hence the error is 0.00017. This is about 11200 of the absolute value of
that in Example 5. Our more extensive computation has produced a much better result.
•
Table 19.6
Computations in Example 6
Interval
Integral
Error (10)
TOL
Comment
[0,2]
0.740480
[0. 11
[1, 2]
0.122794
1.10695
Sum = 1.22974
0.032617
0.0002
Divide fut1her
0.UU4782
0.118934
Sum = 0.123716*
0.000061
0.0001
TOL reached
0.528176
0.605821
Sum = 1.13300
0.001803
0.0001
Divide further
0.200544
0.328351
Sum = 0.528895*
0.000041<
0.00005
TOL reached
0.388235
0.218457
Sum = 0.606692
0.000058
0.00005
Divide further
0.1%244
0.192019
Sum = 0.388263*
0.000002
0.000025
TOL reached
0.153405
0.065078
Sum = 0.218483*
0.000002
0.000025
TOL reached
[U.O, 0.5]
[0.5, 1.0]
[1.0. 1.5]
[1.5, 2.0]
[1.UU, 1.25]
l1.25. 1.50)
[1.50. 1.75]
[1.75, 2.00]
[1.500, 1.625]
[1.625, 1.750]
0.0002
-
[1.750. 1.875]
[1.875, 2.000]
{(xl
1.5
1.0
0.5
o o
Fig. 441.
0.5
x
Adaptive integration in Example 6
826
CHAP. 19
Numerics in General
Gauss Integration Formulas
Maximum Degree of Precision
Our integration fonnulas discussed so far use function values at predetermined
(equidistant) x-values (nodes) and give exact results for polynomials not exceeding a
certain degree [called the degree of precision; see after (9)]. But we can get much more
accurate integration formulas as follows. We set
I
(11)
n
1
f(t) dt =
~ AJfj
j~l
-1
with fixed 11, and t = ± I obtained from x = a, b by setting x = HaCt - I) + bet + 1)].
Then we detennine the 11 coefficients AI' ... , AT! and 11 nodes t l •••• , tn so that (II)
gives exact results for polynomials of degree k as high as possible. Since 11 + 11 = 211 is
the number of coefficients of a polynomial of degree 211 - 1, it follows that k ~ 2n - I.
Gauss has shown that exactne!>s for polynomials of degree not exceeding 2n - 1 (instead
of n - I for predetermined nodes) can be attained, and he has given the location of the
tj (= the jth zero of the Legendre polynomial P n in Sec. 5.3) and the coefficients Aj which
depend on 11 but not on f(t), and are obtained by using Lagrange' s interpolation polynomial,
as shown in Ref. [E5] listed in App. I. With these tj and Aj , formula (11) is called a Gauss
integration formula or Gauss quadrature formllla. Its degree of precision is 211 - I, as
just explained. Table 19.7 gives the values needed for n = 2 .... , 5. (For larger 11. see
pp. 916-919 of Ref. [GRI] in App. 1.)
Table 19.7
n
2
3
4
5
E X AMP L E 7
Gauss Integration: Nodes tj and Coefficients Aj
Nodes
-0.57735 02692
1
0.57735 02692
1
-0.7745966692
0
0.7745966692
0.55555 55556
0.88888 88889
0.55555 55556
-0.8611303116
-0.33998 10436
0.3478548451
0.6521451549
0.33998 10436
0.8611363116
0.6521-l 51549
0.34785 4845 I
-0.90617 98459
-0.5384693101
0
0.5384693101
0.9061798459
0.23692 68851
0.47862 86705
0.56888 88889
0.47862 86705
0.23692 6885 I
Gauss Integration Formula with n
Degree of Precision
Coefficients Aj
Ij
=
3
5
7
9
3
Evaluate the mtegral in Example 3 by the Gauss integration formula (I I) with
/I
=
3.
Solutio'!.." 1 w~ have to c~nvert our integral from 0 to I into an integral from -\ to I. We set r =
Then dx - 2 dr, and (I I) WIth /I ~ 3 and the above values of the nodes and the coefficients yields
!U +
I).
SEC. 19.5
827
Numeric Integration and Differentiation
f~Xp(_X2)dX = ~ {
o
=
~[%exp(-±(I -
exp (- ± (I +
1)2)
dl
-1
/fr)
+
%exp(-~) + %exP(-±(1 + Hf)] =0.746815
(exact to liD: 0.746 825), which is almost as accurate as the Simpson result obtained in Example 3 with a much
larger number of arithmetic operations. With 3 function values (as in this example) and Simpson's rule we would
•
get ~(l + 4e- O.25 + e -1) = 0.747 180. with an error over 30 times that of the Gauss integration.
E X AMP L E 8
Gauss Integration Formula with n
=
4 and 5
4
Integrate f(x) = !7TX cos !7TX from x = 0 to 2 by Gau". Compare with the adaptive integration in Example 6
and comment.
Solutioll.
x =
1
+ 1 gives f(t) =
J = Ad1 + ... + A4f4
!7T(t
+ 1)4 cos (!7T(t + 1». as needed in (11). For 11 = 4 we calculate (6S)
= A 1{f1
+
f4)
+
A 2 (f2
+ i3)
= 0.347855(0.000290309 + 1.02570) + 0.652145(0.129464
+ 1.25459) = 1.25950.
The error is 0.00003 because J = 1.25953 (6S). Calculating with lOS and 11 = 4 gives the same result; so the
error is due to the formula. not rounding. For Il = 5 and lOS we get J = 1.25952 6185, too large by the amount
0.000000250 because J = 1.259525935 (lOS). The accuracy is impressive. particularly if we compare the
•
dmount of work with that in Example 6.
Gauss integration is of considerable practical importance. Whenever the integrand f is
given by a formula (not just by a table of numbers) or when experimental measurements
can be set at times tj (or whatever t represents) shown in Table 19.7 or in Ref. [GRl],
then the great accuracy of Gauss integration outweighs the disadvantage of the complicated
tj and Aj (which may have to be stored). Also, Gauss coefficients Aj are positive for all
n, in contrast with some of the Newton-Cotes coefficients for larger 11.
Of course, there are frequent applications with equally spaced nodes, so that Gauss
integration does not apply (or has no great advantage if one first has to get the tj in (11)
by interpolation).
Since the endpoints -1 and 1 of the interval of integration in ell) are not zeros of Pn'
they do not occur among to, ... , tn. and the Gauss fonnula (11) is called. therefore, an
open formula, in contrast with a closed formula, in which the endpoints of the interval
of integration are to and tn- [For example. (2) and (7) are closed formulas.]
Numeric Differentiation
Numeric differentiation is the computation of values of the derivative of a function f
from given values of f. Numeric differentiation should be avoided whenever possible,
because, whereas integration is a smoothing process and is not affected much by small
inaccuracies in function values, differentiation tends to make matters rough and generally
gives values of f' much le~s accurate than those of f-remember that the derivative is
the limit of the difference quotient. and in the latter you u!>ually have a small difference
oflarge quantities that you then divide by a small quantity. However, the formulas to be
obtained will be basic in the numeric solution of differential equations.
We use the notations fi = t' (x), f;' = f"exj), etc., and may obtain rough approximation
formulas for derivatives by remembering that
t' (x)
= lim f(x
h~O
This suggests
+
11) - f(x)
h
CHAP. 19
828
Numerics in General
t' -
(12)
fl - fo
8f1/2
h
1/2 -
h
Similarly, for the second derivative we obtain
etc.
(13)
More accurate approximations are obtained b) differentiating suitable Lagrange
polynomials. Differentiating (6) and remembering that the denominators in (6) are 2h2,
-h 2 , 2h 2 , we have
Evaluating this at
(14)
Xo, Xl> X2,
we obtain the "three-point formula,,"
I
(a)
f~
=
211 (-3fo
(b)
f~
=
2h (-fo
(c)
f2
,
I
1
= -21z
(fo -
+ 4f1
+
- f2),
f2)'
4fI +
3f2)'
Applying the same idea to the Lagrange polynomial P4(X), we obtain similar formulas,
in particular.
(15)
Some examples and further formulas are included in the problem set as well as in
Ref. [E5] listed in App. I.
1. (Rectangular rule) Evaluate the integral in Example
I by the rectangular rule (1) with a subinterval of length
0.1.
2. Derive a formula for lower and upper bounds for the
rectangular rule and apply it to Prob. I.
!3-8!
TRAPEZOIDAL AND SIMPSON'S RULES
Evaluate the integrals numelically as indicated and
determine the error by using an integration formula known
from calculus.
F(x) =
J
x
1
dx*
-
x*
H(x) =
.
G(x)
IX
0
1~\:*e-~*
dx*
o
dx*
2
cos x* '
3. F(2l by (2).
Il
=
10
4. F(2) by (7), n = [0
5. Gn) by (2).11 = 10
6. G(I) by (7),11 = 10
7. H(4) by (2),
Il
10
8. H(4) by (7),
Il
= 10
!9-121
HALVING
Estimate the error by halving.
9. In Prob. 5
10. In Prob. 6
11. In Prob. 7
12. In Prob. 8
Chapter 19 Review Questions and Problems
829
NON ELEMENTARY INTEGRALS
If IE3d ~ TOL, stop. The result is is3 = 132 + E32 .
(Why does 24 = 16 come in?) Show that we obtain
1"32 = -0.000266, so that we can stop. Arrange your
1- and E-values in a kind of "difference table."
113-191
The following integrals cannot be evaluated by the usual
methods of calculus. Evaluate them as indicated.
Si(x) =
Sex)
I
sinx*
x
o
dx~'.
- - .x~
= {'Sin (X*2) dx*.
C(x)
= {"cos (X*2) dx*
o
0
Si(x) is the sine integral. S(x) and C(x) are the Fresnel
integrals. (See App. 3.1.)
en.
13. SiO) by
II = 5. II = 10
14. Using the values in Prob. 13. obtain a better value for
Si(l). Hint. Use (5).
15. Si(l) by (7), 2111 = 2. 2m = 4
16. Obtain a better value in Prob. 15. Hint. Use (10).
17. Si(l) by (7). 2m = 10
18. S(1.25) by (7). 2111 = 10
19. C( 1.25) by (7). 2111 = 10
If IE3d were greater than TOL, you would have [0
go on and calculate in the next step 141 from (1) with
h = ~: then
143
=
142
+
1"42
with
q2
=
20. (Stability) Prove that the trapezoidal rule is stable with
respect to rounding.
144
=
143
+
E43
with
E43
= 63
142
=
141
+
I
with
E41
1"41
=
"3 (141
-
1 31 )
I
15 (142 -
1 32 )
I
(J43 -
is3)
where 63 = 26 - 1. (How does this come in?)
Apply the Romberg method to the integral of
f( 1:) = ~7TX4 cos !7TX from x = 0 to 2 with TOL = 10-4 .
121-241 GAUSS INTEGRATION
Integrate by (11) with II = 5:
21. IIx from I to 3
22. co~ x from 0 to!7T
23. e-x" from 0 to I
DIFFERENTIATION
24. sin <x 2 ) from 0 to 1.25
25. (Given TOL) Find the smallest 11 in computing the
integral of 1Ix from I to 2 for which 50-accuracy is
guaranteed (a) by (4) in the use of (2). (b) by (9) in the
use of (7). Compare and comment.
26. TEAM PROJECT. Romberg Integration (W.
Romberg. Norske Videllskab. Trolldheil11, FfJrh. 28,
Nr. 7, 1955). This method uses the trapezoidal rule and
gains precision stepwise by halving h and adding an
error estimate. Do this for the integral of f(x) = e- X
ti'om x = 0 to x = 2 with TOL = 10-3 , as follows.
Step 1. Apply the trapezoidal rule (2) with h = 2
(hence n = I) to get an approximation In. Halve II and
use (2) to gel 121 and an error estimate
27. Consider f(x) = X4 for Xo = 0, Xl = 0.2, X2 = 0.4,
X3 = 0.6, X4 = 0.8. Calculate f~ from (14a), (14b),
(14c), (15). Determine the errors. Compare and
comment.
28. A "four-point formula" for the derivative is
Apply it to f(x) = X4 with Xl' ... , X4 as in Prob. 27,
determine the error. and compare it with that in the case
of (15).
29. The derivative f' (x) can also be approximated in terms
of first-order and higher order differences (see
Sec. 19.3):
1
E21 =
22 _
I
(121 -
1u)·
If IE211 ~ TOL, stop. The result is 122 = 121 + E21'
Step 2. Show that E21 = -0.066596, hence
1"211 > TOL and go on. Use (2) with hl4 to get 131 and
add to it the error estimate 1"31 = i(131 - 1 21 ) to get
the better 132 = 131 + 1"31- Calculate
E32
=
I
24 _
I
I
(132 -
1 22)
= 15
(132 -
1 22),
+
"3I .1.3 fo
-
4"1 ...\ 4 fo + - . .. ) .
Compute t' (0.4) in Prob. 27 from this fOlmula usino
differences up to and including first order, ~econd
order, third order, fourth order.
30. Derive the formula in Prob. 29 from (14) in Sec. 19.3.
830
CHAP. 19
Numerics in General
==::==
: ::&W :::....
1. What is a numeric method? How has the computer
influenced numeric methods?
2. What is floating-point representation of nwnbers?
Overflow and underflow?
3. How do error and relative enor behave under addition?
Under multiplication?
4. Why are roundoff errors important? State the rounding
rules.
5. What is an algorithm"! Which of its properties are
important in software implementation?
6. Why is the selection of a good method at least as
important on a large computer as it is on a small one?
7. Explain methods for solving equations, in particular
fixed-point iteration and its convergence.
8. Can the Newton (-Raphson) method diverge? Is it fast?
Same questions for the bisection method.
S T ION SAN D PRO B L EMS
23. What is the relative error of l1a in terms of that of a?
24. Show that the relative error of
25. Compute the solution of x 5 = x + 0.2 near J. = 0 by
transforming the equation into the f01Tl1 x = g(x) and
starting from Xo = O. (Use 6S.)
26. Solve cos x = x by iteration (6S, Xo = I), writing it as
x = (O.74x + cosx)/1.74. obtainingx4 = 0.739085
(exact to (is!). Why does this converge so rapidly?
27. Solve X4 - x 3 - 2x - 34 = 0 by Newton's method
with Xo = 3 and 6S accuracy.
28. Solve cos x - x
22. Answer the question in Prob. 21 for the difference
4.81 - 11.752.
0 by the method of false position.
.fO.Q) = 3.00000
= 1.98007
f(l.2)
f(1.4) = 2.92106
f( 1.6) = 1.111534
10. What do you remember about errors in polynomial
imerpolation?
13. In what sense is Gau~s integration optimal? Explain
details.
14. What does adaptive imegration mean? Why is it useful?
15. Why is numeric differentiation generally more delicate
than numeric integration?
16. Write -0.35287. 1274.799, -0.00614. 14.9482. 113,
8517 in floating-point form with 5S (5 significant digits,
properly rounded).
17. Compute (5.346 - 3.644)/(3.454 - 3.055) as given and
then rounded stepwise to 3S. 2S. I S. ("Stepwise" means
rounding the four rounded numbers. not the given ones.)
Comment on your results.
18. Compute 0.29731/(4.1132 - 4.0872) with the numbers
as given and then rounded stepwise (that is. rounding
the rounded numbers) to 4S. 3S, 2S. Comment.
19. Solve x 2 - 50x + 1 = 0 by (6) and by (7) in Sec. 19.1,
using 5S in the computation. Compare and comment.
20. Solve x 2 - 100x + 4 = 0 by (6) and by (7) in Sec.
19.1, using 5S in the computation. Compare and comment.
21. Let 4.81 and 12.752 be correctly rounded to the number
of digits shown. Determine the smallest interval in
which the sum (using the true instead of the rounded
values) must lie.
=
29. Compute f(1.28) from
9. What is the advamage of Newton's interpolation
formulas over Lagrange's?
11. What is spline interpolation? Tts advantage over
polynomial interpolation?
12. List and compare numeric integration methods. When
would you appl} them'!
a2 is about twice that of
a.
f( 1.8) = 2.69671
f(l.O) = 2.54030
by linear interpolation. By quadratic interpolation. using
f( 1.2), fOA-), IO.6).
30. Find the cubic spline for the data
f(-I) = 3
f(1) = I
f(3) = 23
f(5) = -1-5
ko = k3 = 3.
31. Compute the integral of X3 from 0 to I by the trapezoidal
rule with 11 = 5. What error bounds are obtained from
(4) in Sec. 19.5? What is the actual error of the result?
Why is this result larger than the exact value?
32. Compute the integral of cos (X2) from 0 to I by
Simpson's rule with 2111 = 2 and 2171 = 4 and estimate
the error by (10) in Sec. 19.5. (This is the Fresnel
integral (38) in App. 3.1 with x = 1.)
33. Compute the integral of cos x from 0 to
three-eights rule
f
the
3
b
a
-!rr by
f(x) dx =
"8 hUo + 3.fl + 3f2 + f3)
-
-
1
80
.
(b - a)h 4 !'IV)(t)
and give error bounds; here a ~ f ~ band
Xj = a + (b - a)jI3, j = 0, ... , 3.
831
Summary of Chapter 19
,
....
....._......
-
..
II
•
Numerics in General
[n this chapter we discussed concepts that are relevant throughout numeric work as
a whole and methods of a general nature, as opposed to methods for linear algebra
(Chap. 20) or differential equations (Chap. 21).
In scientific computations we use the floating-point representation of numbers
(Sec. 19.1); fixed-point representation is less suitable in most cases.
Numeric methods give approximate values a of quantities. The error E of a is
(Sec. 19.1)
(1)
where a is the exact value. The relatil'e error of Ii is E/a. Errors arise from rounding,
inaccuracy of measured values, truncation (that is. replacement of integrals by sums,
series by partial sums), and so on.
An algorithm is called numerically stable if small changes in the initial data give
only cOiTespondingly small changes in the final results. Unstable algorithms are
generally useless because errors may become so large that results will be very
inaccurate. Numeric instability of algorithms must not be confused with
mathematical instability of problems ("ill-conditioned problems," Sec. 19.2).
Fixed-point iteration is a method for solving equations f(x) = 0 in which the
equation is first transformed algebraically to x = g(x), an initial guess Xo for the
solution is made, and then approximations X10 X2 • • • • , are successively computed
by iteration from (see Sec. 19.2)
(2)
Xn+l
=
g(xn )
Newton's method for solving equations f(x)
(3)
X,,+l
= Xn
-
(n
= O. 1. ... ).
= 0 is an iteration
(Sec. 19.2).
Here Xn+1 is the x-intercept of the tangent of the curve y = f(x) at the point X n •
This method is of second order (Theorem 2, Sec. 19.2). If we replace f' in (3) by
a difference quotient (geometrically: we replace the tangent by a secant). we obtain
the secant method; see (10) in Sec. 19.2. For the bisection method (which converges
slowly) and the method offalse position, see Problem Set 19.2.
Polynomial interpolation means the detelTnination of a polynomial P>l(x) such
that Pn(Xj) = Ii, where.i = 0, ... , 11 and (xo, f 0), ••. , (xn , fn) are measured or
observed values, values of a function, etc. Pn(x) is called an intelpo/afioll poZvl1omial.
For given data. Pn(X) of degree 11 (or less) is unique. However. it can be written in
different forms, notably in Lagrange's form (4). Sec. 19.3, or in Newton's divided
difference form (10). Sec. 19.3, which requires fewer operations. For regularly
spaced xo, Xl = Xo + h, .... xn = Xo + l1h the latter becomes Newton's forward
difference formula (formula (14) in Sec. 19.3)
832
CHAP. 19
Numerics in General
f(x) = Pn(X) = fo
(4)
where r
=
+
rt:.fo
+ ... +
r(r - 1) ... (r - n
+
I)
t:.nfo
n!
(x - xo)lh and the forward differences are t:.fj = f j +1 - fj and
(k
=
2,3, .. ').
A similar formula is Newton's backward difference illterpolationfo1711ula (formula
(18) in Sec. 19.3).
Interpolation polynomials may become numerically unstable as 11 increases, and
instead of interpolating and approximating by a single high-degree polynomial it is
preferable to use a cubic spline g(x). that is. a twice continuously differentiable
interpolation function [thus. g(Xj) = fjJ, which in each subinterval Xj ~ x ~ Xj+1
consists of a cubic polynomial qj(x): see Sec. 19.4.
Simpson's rule of numeric integration is [see (7), Sec. 19.5]
h
b
(5)
fa
f(x) dx =
3"
(fo
+
4fL
+
2f2
+ 4j3 + ... +
2f2711-2
+ ~f2m-l +
f2711)
with equally spaced nodes Xj = Xo + jh. j = 1.... ,2m, h = (b - a)/(2m). and
f j = f(xj)' It is simple but accurate enough for many applications. Its degree of
precision is DP = 3 because the error (8), Sec. 19.5. involves h4. A more practical
error estimate is (10), Sec. 19.5.
obtained by first computing with step h, then with step /1/2, and then taking III 5 of
the difference of the results.
Simpson's rule is the most important of the Newton-Cotes formulas, which are
obtained by integrating Lagrange interpolation polynomials, linear ones for
the trapezoidal rule (2), Sec. 19.5, quadratic for Simpson's mle. cubic for the
three-eights rule (see the Chap. 19 Review Problems). etc.
Adaptive integration (Sec. 19.5, Example 6) is integration that adjusts
("adapts") the step (automatically) to the variability of f(x).
Romberg integration (Team Project 26, Problem Set 19.5) starts from the
trapezoidal rule (2). Sec. 19.5. with h. h12, h/4, etc. and improves results by
systematically adding error estimates.
Gauss integration (II), Sec. 19.5, is important because of its great accuracy
(DP = 217 - 1, compared to Newton-Cotes's DP = 11 - 1 or 11). This is achieved
by an optimal I;hoice of the nodes, which are not equally spaced; see Table 19.7,
Sec. 19.5.
Numeric differentiation is discussed at the end of Sec. 19.5. (Its main application
(to differential equations) follows in Chap. 21.)
,
CHAPTER
J
20
Numeric Linear Algebra
In this chapter we explain "orne of the most important numeric methods for solving linear
systems of equations (Secs. 20.1-20.4), for fitting straight lines or parabolas (Sec. 20.5),
and for matrix eigenvalue problems (Secs. 20.6-20.9). These methods are of considerable
practical importance because many problems in engineering, statistics, and elsewhere lead
to mathematical models whose solution requires methods of numeric linear algebra.
COM MEN T. This chapter is independent of Chap. 19 and can be studied immediately
after Chap. 7 or 8.
Prerequisite: Secs. 7.1. 7.2, 8.1.
Sections that may be omitted in a shorter course: 20.4. 20.5. 20.9
References and Answers to Problems: App. I Part E. App. 2
20.1
Linear Systems: Gauss Elimination
A linear system of n equations in
E I , . . . , En of the form
11
unknowns
Xl ••• ,
Xn is a set of equations
(1)
where the coefficients ajk and the bj are given numbers. The system is called homogeneous
if all the bj are zero; otherwise it is called nonhomogeneous. Usmg matrix multiplication
(Sec. 7.2), we can write (1) as a single vector equation
(2)
Ax
where the coefficient matrix A
A=
=
=b
[ajk] is the 11 X 11 matrix
au
al2
al n
a21
a22
a2n
and
anI
an2
ann
bl
Xl
x=
and
Xn
and b =
bn
833
834
CHAP. 20
Numeric Linear Algebra
are column vectors. The following matrix
system (I):
A=
[A
b]
A is
called the augmented matrix of the
=
A solution of (I) is a set of numbers Xl' • . • , Xn that satisfy all the II equations, and a
solution vector of (1) is a vector x whose components constitute a solution of (1).
The method of solving such a system by determinants (Cramer's rule in Sec. 7.7) is
not practical, even with efficient methods for evaluating the determinants.
A practical method for the solution of a linear system is the so-called Gal/ss eliminatioll,
which we shall now discuss (proceeding illdependently of Sec. 7.3).
Gauss Elimination
This standard method for solving linear systems (I) is a systematic process of elimination
that reduces (1) to "triangular form" because the system can then be easily solved by
"back substitution." For instance, a triangular system is
and back substitution gives
X3
= 3/6 = 112 from the third equation, then
from the second equation, and finally from the first equation
How do we reduce a given system (I) to triangular form? In the first step we elimil1ate
from equation E2 to En in (I). We do this by adding (or subtracting) suitable multiples
of E1 from equations E2 , ••• , En and taking the resulting equations, call them E~, ... ,
E~ as the new equations. The first equation, E 1 , is called the pivot equation in this step,
and llU is called the pivot. This equation is left unaltered. In the second step we take the
new second equation E~ (which no longer contains Xl) as the pivot equation and use it to
eliminate X2 from E; to E~. And so on. After Il - I steps this gives a triangular system
that can be solved by back substitution as just shown. In this way we obtain precisely all
solutions of the given system (as proved in Sec. 7.3).
The pivot llkk (in step k) must be different from zero and should be large in absolute
value, to avoid roundoff magnification by the multiplication in the elimination. For
this we choose as our pivot equation one that has the absolutely largest ajk in column
k on or below the main diagonal (actually, the uppermost if there are several such
equations). This popular method is called partial pivoting. It is used in CASs (e.g.,
in Maple).
Xl
SEC 20.1
835
Linear Systems: Gauss Elimination
Partial pivoting distinguishes it from total pivoting, which involves both row and
column interchanges but is hardly used in practice.
Let us illustrate this method with a simple example.
EXAMPLE 1
Gauss Elimination. Partial Pivoting
Solve the system
Solutioll.
We must pivot since El ha, no xl-term. In Column 1. equation E3 ha, the large,t coefficient.
Hence we interchange El and E3,
6Xl + 2x2 + 8x3 =
3X1
26
+ 5X2 + 2X3
=
8.1:2 + 2X3
~
8
-7.
Step 1. Elimillatioll of Xl
It would suftlce to show the augmented matrix and operate on it. We show both the equations and the augmented
matrix. In the first step. the first equation is the pivot equation. Thus
Pivot 6----+(§i)+ 2X2 + 8x3
=
26
Eliminate - - ' > ~ + 5x2 + 2r3 =
8X2 + 2x3
8
= -7
[:
2
8
5
2
8
2
':]
-7
To eliminate Xl from the other equations (here. from the second equation). do:
Subtract 3/6
~
112 times the pivot equation from the second equation.
The result is
ntl
+
+
=
26
2
8
4X2 - 2'-3 = -5
4
-2
-5
8x2 + 2'3 = -7
8
2
-7
2x2
8x3
26]
Step 1. Elimillatioll of X2
The largest coefficient in Column 2 is 8. Hence we take the nell' third equation as the pivot equation, interchanging
equations 2 and 3,
26
2
8
Pivot 8----+
~+ 2x3 =-7
8
2
-7
Eliminate
I§l- 2X3 =
4
-2
-5
6Xl + 2x2 + 8x3 =
--'>
-5
[:
26]
To eliminate .1:2 from the third equation. do:
Subtract 112 times the pivot equation from the third equation.
The resulting triangular system i, shown below. Thi, is the end of the forward elimination. Now comes the back
substitution.
Back substitutioll. Determillatioll of X3, X2, Xl
The trIangular system obtained in Step 2 is
2
8
8
2
-7
o
-3
_.2
26]
2
836
CHAP. 20
Numeric Linear Algebra
From this system. taking the last equation, then the second equation. and finally the fust equation. we compute
the solution
x 3 -- 12
X2
= ~(-7 - 2X3) = -I
Xl
= ~(26
- 2~2 -
8x3)
= 4.
This agrees with the values given above. before the beginning of the example.
•
The general algorithm for the Gauss elimination is shown in Table 20.1. To help explain
the algorithm, we have numbered some of its lines. bj is denoted by aj,n+l' for uniformity.
In lines I and 2 we look for a possible pivot. [For k = I we can always find one; otherwise
Xl would not occur in (l ).] In line 2 we do pivoting if necessary, picking an ajk of greatest
absolute value (the one with the smallest j if there are several) and interchange the
corresponding rows. If lakkl is greatest, we do no pivoting. 11ljk in line 3 suggests multiplier,
since these are the factors by which we have [0 multiply the pivor equation E~ in Step k
before subtracting it from an equation Ef below E~' from which we want to eliminate Xk'
Here we have written E~ and E/ to indicate that after Step I these are no longer the given
equations in (1), but these underwent a change in each step, as indicated in line 4.
Accordingly, ajk etc. in lines 1-4 refer to the most recent equations, andj ~ k in line I
indicates that we leave un[Ouched all the equations that have served as pivot equations in
previous steps. For p = k in line 4 we get 0 on the right. as it should be in the elimination.
In line 5, if the last equation in the triangular system is 0 = b~ *- 0, we have no
= 0, we have no unique solution because we then have fewer
solution. If it is 0 =
equations than unknowns.
b:
E X AMP L E 2
Gauss Elimination in Table 20.1, Sample Computation
In Example 1 we had all = O. so that pivoting wa, necessary. The greatest coefficient in Column I was a3l'
Thus J = 3 in line 2. and we interchanged El and E 3 . Then in lines 3 and 4 we computed 11121 = 3/6 = ~ and
G22 =
5 - ~ . 2 = 4.
G23 =
2 - ~. 8 = -2.
G24
= 8 - ~ . 26 = -5.
and then 11131 = 0/6 = O. so that the third equation 8X2 + 2X3 = -7 did not change in Step I. In Step 2
= 2) we had 8 as the greatest coefficient in Column 2. hence J = 3. We interchanged equations 2 and 3.
computed 11132 = -4/8 = - ! in line 4. and the a33 = -2 - !·2 = -3. a34 = -5 - ~(-7) = -~. This
produced the triangular fOim used in the back substitution.
•
(k
If akk = 0 in Step k, we must pivot. If lakkl is small. we should pivot because of roundoff
error magnification that may seriously affect accuracy or even produce nonsensical re~ults.
E X AMP L E 3
Difficulty with Small Pivots
The solution of the 'ystem
0.0004Xl + 1.402x2 = 1.406
0.4003xl - 1.502x2 = 2.501
is Xl = 10'"'2 = 1. We solve this system by the Gauss elimination. using four-digit floating-point arithmetic.
(4D is for simplicity. Make an 8D-arithmetic example that shows the same.)
In
(a) Picking the first of the given equations as the pivot equation. we have to mUltiply this equation by
0.4003/0.0004 = 1001 and subtract the result from the second equation. obtaining
=
SEC. 20.1
837
Linear Systems: Gauss Elimination
Table 20.1
Gauss Elimination
ALGORITHM GAUSS
(A =
[ajd = LA
b])
This algorithm computes a unique solution x
=
[Xj] of the system (1) or indicates that
(1) has no unique solution.
INPUT:
Augmented n X (n
=
matrix
A=
[ajk]' where aj,n+l = bj
1. . . . , n - 1. do:
0 for all j
exists."' Stop
If ajk =
1
I)
Solution x = lXj] of (I) or message that the system (1) has no
unique solution
OCTPUT:
For k
+
~
k then OUTPUT "No unique solution
[Procedure completed unsuccessfully; A is singular]
2
Else exchange the contents of rows J and k of A with J the smallest
j ~ k such that lajkl is maximum in column k.
3
For j
k
=
+
I. .... n. do:
. _ ajk
1njk- - - akk
For p = k
4
I
+
ajp:
I. . . . , n + I. do:
= ajp - 1njk a kp
End
End
End
If ann = 0 then OUTPUT "No unique solution exists."
Stop
Else
5
6
[Start back substitution]
For i
7
=
n - 1. . . . . 1, do:
Xi =
....!...
(ai,n+l -
aii
i
aijXi)
j=i+l
End
OUTPUT x = [Xj]. Stop
End GAUSS
~ 1405x2 =
Hence x2
=
~ 1404/( ~ 1405)
Xl
This failure occurs because
large error in Xl'
=
~ 1404.
= 0.9993, and from the first equation.
1
00004 (l.406 ~ 1.402· 0.9993)
.
instead of Xl
U.oo5
=-=
0.0004
=
10. we get
12.5.
iani is small compared with ia12i. so that a small roundoff error in X2 leads to a
838
CHAP. 20
Numeric Linear Algebra
(b) Picking the second of the given equations as the pivot equation. we have
10
multiply this equation by
0.0004/0.4003 = 0.0009993 and subtract the result from the first equation. obtaining
1.404x2 = 1.404.
Hence .\"2 = I, and from the pivot equation xl = 10. This success occur~ because la2l1 is not very small
compared to la221. so that a small roundoff enOf in .\"2 would not lead to a large error in Xl' Indeed. for
instance. if we had the value x2 = 1.002. we would still have from the pivot equation the good value
Xl = (2.501 + 1.505)10.4003 = 10.01.
•
Error estimates for the Gauss elimination are discussed in Ref. [E5] listed in App. I.
Row scaling means the multiplication of each Row j by a suitable scaling factor ~J- It is
done in connection with partial pivoting to get more accurate solutions. Despite much
research (see Refs. [E9]. [E24] in App. I) and the proposition of several principles, scaling
is still not well understood. As a possibility, one can !>cale for pivot choice only (not in
the calculation, to avoid additional roundoff) and take as first pivot the entry aj1 for which
IlljlltlAjl is largest: here Aj is an entry of largest absolute value in Row j. Similarly in the
further steps of the Gauss elimination.
For instance, for the system
4.0000X1
+
14020.\"2
=
14060
0.4003.\"1 - 1.502.\"2 = 2.501
we might pick 4 as pivot, but dividing the first equation by 104 gives the system in Example
3, for which the second equation is a better pivot equation.
Operation Count
Quite generally, important factors in judging the quality of a numeric method are
Amount of storage
Amount of time (== number of operations)
Effect of roundoff elTOL
For the Gauss elimination, the operation count for a full matrix (a matrix with relatively
many nonzero entries) is as follows. In Step k we eliminate Xk from n - k equations. This
needs n - k divisions in computing the IIljk (line 3) and (n - k)(11 - k + 1) multiplications
and as many subtractions (both in line 4). Since we do 11 - 1 steps. k goes from I to
n - I and thus the total number of operations in this forward elimination is
n-1
fen)
=
L
n-1
(n -
k)
+2L
k~l
n-1
=L
s~l
(11 - k)(n - k
+
(write Il - k = s)
1)
k~l
n-1
S
+
2
L
s(s
+
1) = !(Il - 1)11
+ 1(11 2
-
1)n = 111
3
S=l
where 2n 3/3 is obtained by dropping lower powers of 11. We see that f(ll) grows about
proportional to /1 3 . We say that f(n) is of order n 3 and write
SEC. 20.1
839
Linear Systems: Gauss Elimination
where 0 suggests order. The general definition of 0 is as follows. We write
f(ll) = O(h(n))
if the quotient If(n)lh(Iz)1 remains bounded (does not trail off to infinity) as 11 ~ x. In
our present case, hen) = 11 3 and, indeed. f(n)ln 3 ~ 2/3 because the omitted telms divided
by 11 3 go to zero as n _ h .
In the back substitution of Xi we make 11 - i multiplications and as many subtractions,
as well as 1 division. Hence the number of operations in the back substitution is
n
b(ll) = 2
2: (11
-
i)
+ 11
= 2
2: s + n
=
n(n
+
I)
+
n =
/1
2
+
211 =
2
0(11 ).
5=1
We see that it grows more slowly than the number of operations in the forward elimination
of the Gauss algorithm, so that it is negligible for large systems because it is smaller by
a factor 11, approximately. For instance, if an operation takes 10- 9 sec, then the times
needed are:
Algorithm
n = 1000
Elimination
Back substitution
n=10000
0.7 sec
11 min
0.001 sec
0.1 sec
.P RO 8 E=E M- S E.E10:;-L
For applicatiolls of linear systems see Secs. 7.1 and 8.2.
11-31
5. 2X1
GEOMETRIC INTERPRETATION
6.\1
-
8x2 = -4
+
2x2
= 14
Solve graphically and explain geometrically.
1. 4.\1
+
X2 =
3.\1 - 5.\2
2.
=
-4.3
6. 25.38x1
-33.7
-7. 05x 1
1.820.\1 - 1.183x2 = 0
-12.74.\1
3.
+
7.2.\1 -21.6.\1
14-141
+
8.281.\2 = 0
3.5x2 =
10.5.\2 = -4R.5
6X2
X2
=
-3
= 30.60
=
4.30.\2
+
8.
5x 1
+
3X2
- 4x2
IOx1 - 6.\2
= 137.86
8X3
= -85.88
178.54
+
X3
15x2
=
2
+ 8X3 = -3
+
26x3
9. 4.\1 + IOx2 - 2X3
-Xl -
-8.50
13x3
13x1 - 8X2
GAUSS ELIMINATION
+
+
6X1
16.0
Solve the following linear systems by Gauss elimination.
with partial pivoting if necessary (but without scaling). Show
the intermediate steps. Check the result by substitution. If no
solution or more than one solution exists, give a reason.
4. 6.\"1
7.
15.48x2
+
=
=
3X3 =
0
-20
30
25x2 - 5X3 = -50
CHAP. 20
840
10.
Xl
+
2X2
lOx 1
+
X2
IOx2
11. 3.4x 1
-Xl
3X3
-II
+
X3
8
i-
2X3
=
(b) Gauss elimination and nonexistence. Apply the
Gauss elimination to the following two systems and
compare the calculations step by step. Explain why the
elimination fails if no solution exists.
2
6.l2x2 - 2. 72x3 =0
+
1.80X2
2.7x1 - 4.86x2
12.
Numeric Linear Algebra
3X2
+
+
0.80X3
-
2. 16x3=0
5X3
=
1.20736
+
6X3
=
1.6x2
-1.6
+ 4. 8x3
4.8x2 - 9.6x3
7.2x3
14.
4.4x2
+
32.0
+
7.2x4 = -78.0
+ 4.8x4
20.4
=
3.0x3 - 6.6x4
=
-4.65
+ 8.4X4
=
4.62
-4.35
- 7.6x3
+
3.0x4 =
5.97
15. CAS EXPERIMENT. Gauss Elimination. Write a
program for the Gauss eliminarion with pivoting.
Apply it to Probs. 11-14. Experiment with systems
whose coefficient determinant is small in absolute
value. Also investigate the perfonnance of your
program for larger systems of your choice. including
sparse systems.
2X2 - x3
5
9X1
+
5X2 - X3
13
Xl
+
+
3
X2
X3 =
(c) Zero determinant. Why maya computer program
give you the result that a homogeneous linear system
has only the trivial solution although you know its
coefficient determinant to be zero?
(d) Pivoting. Solve System lA) (below) by the Gauss
elimination first without pivoting. Show that for any
fixed machine word length and sufficiently small E > 0
the computer gives X2 = I and then Xl = O. What is
the exact solution? Its limit as E ~ O? Then solve the
system by the Gauss elimination with pivoting.
Compare and comment.
(e) Pivoting. Solve System (H) by the Gauss
elimination and three-digit rounding arithmetic.
choosing (i) the first equation, (ii) the second equation
as pivot equation. (Remember to round to 3S after each
operation before doing the next, just as would be done
on a computer!) Then use four-digit rounding arithmetic
in those two calculations. Compare and comment.
2
16. TEAM PROJECT. Linear S~'stems and Gauss
Elimination. (a) Existence and uniqueness. Find a and
b such that aX1 + X2 = b, Xl + X2 = 3 has (i) a unique
solution. (ii) infinitely many solutions, (iii) no solutions.
20.2
+
4X1
X2
-0.329193
13. 6.4x 1 + 3.2x2
3.2x1 -
3
+
-2.34066
3X1 - 4X2
5x1
0
+ X3
Xl
(E)
-4.61
6.21x1
+
3.35x2 = -7.19
Linear Systems: LU-Factorization,
Matrix Inversion
We continue our discussion of numeric methods for solving linear systems of n equations
in 11 unknowns Xl' ••• , X n ,
(1)
Ax
=
b
where A = [ajk] is the 11 X 11 coefficient matrix and x T = [Xl' .. xn] and b T = [hI' .. bnJ.
We present three related methods that are modifications of the Gauss elimination, which
SEC. 20.2
841
Linear Systems: LU-Factorization, Matrix Inversion
require fewer arithmetic operations. They are named after Doolittle, Crout, and Cholesky
and use the idea of the LV-factorization of A, which we explain first.
An LV-factorization of a given square matrix A is of the form
(2)
A
LV
=
where L is lower triangular and U i, upper triallgular. For example,
A =
[2 3J =
8
[1 OJ [2 3J
LU =
5
4
I
0-7
It can be proved that for any nonsingular matrix (see Sec. 7.8) the rows can be reordered
so that the resulting matrix A has an LU-factorization (2) in which L turns out to be the
matrix of the multipliers Injk of the Gauss elimination, with main diagonal 1, ... , 1, and
V is the matrix of the triangular system at the end of the Gauss elimination. (See Ref.
[E5], pp. 155-156, listed in App. 1.)
The crucial idea now is that Land U in (2) can be computed directly, without solving
simultaneous equations (thus, without using the Gauss elimination). As a count shows. this
needs about n 3 /3 operations. about half as many as the Gauss elimination. which needs about
21l 3 /3 (see Sec. 20.1). And once we have (2), we can use it for solving Ax = b in two steps.
involving only about 11 2 operations. simply by noting that Ax = LUx = b may be written
(3)
(a)
Ly
=b
where
Ux
(b)
=
y
and solving first (3a) for y and then (3b) for x. Here we can require that L have main diagonal
1, ... ,las stated before; then this is called Doolittle's method. Both systems (3a) and
(3b) are triangular, so we can solve them as in the back substitution for the Gauss elimination.
A similar method, Crout's method, is obtained from (2) if V (instead of L) is required
to have main diagonal 1. ...• 1. In either case the factorization (2) is unique.
EXAMPLE 1
Doolittle's Method
Solve the system in Example I of Sec. 20.1 by Doolittle's method.
S Diu ti~Il.
The decomposition (2) is obtained from
5
a12
A
=
[ajk]
=
[ "n
a21
a22
{[23
= [ 30
""]
8
a31
a32
a33
6
2
'] ['
2
=
8
~]F
0
1Il21
nJ31
11132
U12
""]
1123
"22
0
li33
by determining the 111jk and liJ1<' using matrix multiplication. By going through A row by rov. we get successively
all =
3 = I . lIll =
11121 =
lill
a12
=
5
=
I • lI12
0
= 1112
a13
=2 =
I . 1/13 =
a23
= 2 =
111211113
1123
-I
li23
=2
=2.2 11132 =
+
li13
I •2 +
li33
842
CHAP. 20
Numeric Linear Algebra
Thus the factorization (2) is
:
:
:] =
6
2
8
[
LU [:1 0
=
2
-I
~] [:
: :] .
I
0
0
6
We first solve Ly = b, determining '"I = 8. then.\"2 = -7, then Y3 from 2Y1 - )"2 +
thu~ (note the interchange in b because of the interchange in A!)
[~
2
0
:] [::::] = [ _ : ] .
-I
26
1)3
Then we solve Ux = y. determining
Solution
x3 =
)"3 =
16 + 7 + Y3 = 26;
,=[:]
3/6. then x2, then xl' that is,
Solution
x=
[J
•
This agrees with the solution in Example I of Sec. 20.1.
Our formulas in Example I suggest that for general 11 the entries of the matrices
L = rmjk] (with main diagonal l. ...• 1 and mjk suggesting "multiplier") and U = [Ujk]
in the Doolittle method are computed from
k=l,"',n
j = 2,···, n
j-1
(4)
Ujk
=
lljk -
L
k
11ljs U s k
= j .... , n; j
~
2
s=l
j = k
Row Interchanges.
+ 1, ... , n; k
~
2.
Matrices, such as
[~
:J
or
[~ ~J
have no LU-factorization (try!). This indicates that for obtaining an LU-factorization. row
interchanges of A (and corresponding interchanges in b) may be necessary.
Cholesky's Method
'*
For a symmetric, positive definite matrix A (thUS A = AT, X TAx > 0 for all x
0) we
can in (2) even choose U = L T, thus Ujk = n1kj (but cannot impose conditions on the main
diagonal entries). For example,
SEC. 20.2
843
Linear Systems: LU-Factorization. Matrix Inversion
The popular method of solving Ax = b based on this factorization A = LLT is called
Cholesky's method. In terms of the entries of L = [ljkl the formulas for the factorization
are
= 2,"',11
j
j-l
(6)
L
Gjj -
Ij8
2
j = 2.···.11
8=1
= j + I, . . . , 11;
p
j
~
2.
If A is symmetric but not positive definite, this method could still be applied, but then
leads to a complex matrix L, so that the method becomes impractical.
E X AMP L E 1
Cholesky's Method
Solve by Cholesky"s method:
4.\"1 +
2X2
+ 14.\"3
2"1 + 17.\"2 14xl -
Solution.
5x2
=
14
5.\"3 = -!o1
+
83.\"3 =
155.
From (6) or from the form of the factorization
I~ ~:l
:
[
-5
14
o
: 1[~1
= [:::
83
133
131
o
0
we compute. in the given order.
111 =
~
=
2
121 =
a21
-
2
=
-
2
111
a31
= I
131 =
-
14
= -
111
=
2
7
This agrees with (5). We now have to solve Ly = b, that is.
0
[;
4
-3
0] ["'] [ 14]
~ ~~
As the second step. we have to solve Ux
[:
4
0
=
=
-~:~
LT X
Solution
.
y~ [-':J
= y, that is.
-;] [:} [-,;]
Solution
x~ [-:J
•
CHAP. 20
844
Numeric Linear Algebra
Stability of the Cholesky Factorization
THEOREM 1
The Clwlesky LLT -!actorizatio/1 is numerically stable (as defined in Sec. 19.1).
PRO 0 F
We have ajj = lj12 + Ij22 + ... + ljf by squaring the third formula in (6) and solving it
for ajj. Hence for allljk (note that ljk = 0 for k > j) we obtain (the inequality being trivial)
That is,
ljk
2
is bounded by an entry of A, which means stability against rounding.
•
Gauss-Jordan Elimination. Matrix Inversion
Another variant of the Gauss elimination is the Gauss-Jordan elimination, introduced
by W. Jordan in 1920, in which back substitution is avoided by additional computations
that reduce the matrix to diagonal form. instead of the triangular form in the Gauss
elimination. But this reduction from the Gauss triangular to the diagonal form requires
more operations than back substitution does, so that the method is disadvantageous for
solving systems Ax = h. But it may be used for matrix inversion, where the situation is
as follows.
The inverse of a nonsingular square matrix A may be determined in principle by solving
the 11 systems
(j
(7)
=
1, ... , 11)
where bj is the jth column of the 11 X 11 unit matrix.
However. it is preferable to produce A -1 by operating on the unit matrix I in the same
way as the Gauss-Jordan algorithm, reducing A to I. A typical illustrative example of this
method is given in Sec. 7.8.
..........- . ........
!'!I!.~.!I!.!IIII~III!.':!I!_~_lJ!!I!!y~~::P---
11-71
•• _ -
...,.
DOOLITTLE'S METHOD
Show the factorilation and solve by Doolittle's method.
1.
3Xl
+
15xl
+
2x2
=
15.2
Ilx2 = 77.3
6.
-9.88
3.3x3 = -16.54
0.5Xl - 3.0X2 +
-1.5Xl - 3.5x2 -
7.
3Xl
18xl
+
9X2
+
+ 48x2 +
9Xl -
27x2
1O.4x3 =
6X3 =
21.02
2.3
39x3 = 13.6
+ 42x3
=
4.5
8. TEAM PROJECT. Crout's method factorizes
A = LV, where L is lower ttiangular and V is upper
SEC. 20.3
845
Linear Systems: Solution by Iteration
triangular with diagonal entnes lljj = l,j = 1, ... ,11,
(a) Formulas. Obtain formulas for Crout's method
similar to (4).
20
(b) Examples. Solve Probs. I and 7 by Crout's method.
(e) Factor the following matrix by the Doolittle.
Crout. and Cholesky methods.
-4
l-:
14. CAS PROJECT. Cholesky's Method. (a) Write a
program for solving linear systems by Cho)esky's
method and apply it to Example 2 in the text, to Probs.
9-11, and to systems of your choice.
25
4
(b) Splines. Apply the factorization part of the
program to the following matrices (as they occur in (9),
Sec. 19.4 (with cJ = 1), in connection with splines).
(d) Give the formulas for factoring a tridiagonal
matrix by Crout's method.
(e) When can you obtain Crout's factorization from
Doolittle's by transposition?
19-131
4
CHOLESKY'S METHOD
[
Show the factorization and solve.
9.
9X1
+
6X2
+
12x3 =
6Xl
+
13x2
+
IIx3 = 118
12x1
+
IIx2
+
26x3 = 154
0.64X2
+
0.32x3 = 1.6
0.12x1 + 0.32x2
+
0.56x3 = 5.4
+
6x 1
+
8X1
+
12.
Xl
-Xl
-
+
6X2 +
8X3 =
0
34x2
+
52x3 =
-80
52x2
+
129x3 = -226
X2
+
3X3
5X2 -
+
3X1 - 5X2
+
19x3
+
2X1 - 2X2
+
3X3
+
20.3
2X4 =
4
o
2
INVERSE
Find the inverse by the Gauss-Jordan method. showing the
details.
16. In Prob 4.
17. In Prob. 5.
18. In Prob. 6.
19.
30
3X4 =
188
=
2
21x4
o
o
116-191
2x4 = -70
5X3
].
o
o
15. (Definiteness) Let A and B be positive definite 11 X /1
matrices. Are - A, A f. A + B, A - B positive definite?
0. 12x3 = 1.4
11. 4xl
4
87
+
10. 0. 04X 1
o
2
In Prob. 7.
20. (Rounding) For the following matrix A find det A.
What happens if you round off the given entries to (a)
5S, (b) 4S, (c) 3S, (d) 2S, (e) IS? What is the practical
implication of your work?
114
-3/28
Linear Systems: Solution by Iteration
The Gauss elimination and its variants in the last two sections belong to the direct methods
for solving linear systems of equations; these are methods that give solutions after an
amount of computation that can be specified in advance. In contrast. in an indirect or
iterative method we start from an approximation to the true solution and. if successful,
obtain better and better approximations from a computational cycle repeated as often as
may be necessary for achieving a required accuracy, so that the amount of arithmetic
depends upon the accuracy required and varies from case to case.
846
CHAP. 20
Numeric Linear Algebra
We apply iterative methods if the convergence is rapid (if matrices have large main
diagonal entries, as we shall see), so that we save operations compared to a direct method.
We also use iterative methods if a large system is sparse, that is, has very many zero
coefficients. so that one would waste space in storing zeros, for instance, 9995 zeros per
equation in a potential problem of 104 equations in 104 unknowns with typically only 5
nonzero terms per equation (more on this in Sec. 21.4).
Gauss-Seidel Iteration Method l
This is an iterative method of great practical importance. which we can simply explain in
terms of an example.
E X AMP L E 1
Gauss-Seidel Iteration
We consider the linear system
=
-0.25x I
+
50
- 0.25x4 = 50
(I)
+
-0.25.1'1
- 0.25.1'2 - 0.25.1'3
(Equations of this form arise in the numeric
in the fonn
~olution
+
of PDEs and in spline interpolation.) We write the system
0.25x2 + 0.25x3
Xl =
+ 50
X2
= 0.25.1'1
+ 0.25x4 + 50
X3
= 0.25xl
+ 0.25x4 + 25
(2)
0.25 t2 + 0.25x3
X4 =
+ 25.
These equations are now used for iteration: that is. we start from a (possibly poor) approximatIOn to the solution.
say xiO) = 100, x~O> = 100. x~O) = 100. x~O) = 100. and compute from (2) a perhaps better approximation
Use "old" values
("New" values here not yet available)
t
0.25x~o)
(3)
X 2(1)-
0.25x~O)
+ 50.00 = 100.00
0. 25x ll
i
0.25x~0l
+ 50.00 = 100.00
ill
0. 25xiO)
+ 25.00 = 75.00
0. 25x
lll X4
-
+
0.25x~1l + 0.25x~1l
+ 25.00 = 68.75
t
Use "new" values
These equations (3) are obtained from (2) by substituting on the right the //lost recellt approximation for each
unknown. In fact, correspondmg value~ replace previous ones as soon as they have been computed. so that in
IPHlLIPP LUDWIG VON SEIDEL (1821-1896), German mathematician. For Gauss see foomore 5 in
Sec. 5.4.
SEC. 20.3
847
Linear Systems: Solution by Iteration
the second and third equations we use xill (not xiOJ), and in the last equation of (3) we use x~l) and x~l) (not
x~Ol and x~OJ). Using the same principle. we obtain in the next step
0.25x~ll + 0.25x~ll
+ 50.00
X~2) = 0.25xi2 )
+ 0.25x~1l + 50.00
x~2) = 0.25xi
+
2
)
0.25x~1)
(2) _
X4
-
=
93.750
= 90.625
+ 25.00
= 65.625
+ 25.00
= M.062
Further steps give the values
Xl
X2
X3
X4
89.062
87.891
87.598
87.524
87.506
88.281
87.695
87.549
87.512
87.503
63.281
62.695
62.549
62.512
62.503
62.891
62.598
62.524
62.506
62.502
Hence convergence to the exact solution Xl
=
X2 =
87.5, x3
= X4
= 62.5 (verify!) seems rather fast.
•
An algorithm for the Gauss-Seidel iteration is shown on the next page. To obtain the
algorithm, let us derive the general formulas for this iteration.
We assume that ajj = I for j = 1, ... , n. (Note that this can be achieved if we can
realTange the equations so that no diagonal coefficient is zero; then we may divide each
equation by the corresponding diagonal coefficient.) We now write
(4)
A=I+L+U
(ajj
=
1)
where I is the n x n unit matrix and Land U are respectively lower and upper triangular
matrices with zero main diagonals. If we substitute (4) into Ax = b. we have
Ax = (I
+ L + U) x = b.
Taking Lx and Ux to the right, we obtain, since Ix = x,
(5)
x = b - Lx - Ux.
Remembering from (3) in Example I that below the main diagonal we took "new"
approximations and above the main diagonal "old" ones, we obtain from (5) the desired
iteration formulas
(6)
where x(m)
[_l}m)] is the mth approximation and x(m+ 1) = [X}m+ 1)] is the (m + I )st
approximation. In components this gives the formula in line I in Table 20.2. The matrix
A must satisfy Gjj =1= 0 for allj. In Table 20.2 our assumption ajj = I is no longer required,
but is automatically taken care of by the factor l/ajj in line 1.
848
CHAP. 20
Table 20.2
Numeric Linear Algebra
Gauss-Seidel Iteration
ALGORITHM GAUSS-SEIDEL (A. b,
XlV), E, N)
This algorithm computes a solution x of the system Ax = b given an initial approximation
x W), where A = [ajd is an 11 X 11 matrix with lljj =1= 0, j = I, ... , 11.
A, b, initial approximation
of iterations N
INPUT:
OUTPUT:
x(Q),
tolerance
E
> 0, maximum number
Approximate solution x(m) = [x7")] or failure message thal X(N) does
not satisfy the tolerance condition
For 111 = 0..... N - 1. do:
For j = L ... , /I, do:
r~m+ll =
•.1
-
I
(j-l
b. - ~
£.J
J
(l ..
2
€
£.J
Jk k
k-]
JJ
End
If m':IX IX)7n+ll - xj",JI <
It
~
(/. x(m+lJ -
(/.
x(mJ
Jk k
)
k~j+l
then OUTPUT
xcm+ll.
Stop
J
[Procedure completed successtittly 1
End
OUTPUT:
"No solution satisfying the tolerance condition obtained after N
iteration steps." Stop
[Procedure completed
U11.\"Ucces.~f~ttly]
End GAUSS-SEIDEL
Convergence and Matrix Norms
An iteration method for solving Ax = b is :-'did to converge for an initial x(V) if the
corresponding iterative sequence xW>, x(1), X(2 ), . . . converges to a solution of the given
system. Convergence depends on the relation between x(m) and x(m+lJ. To get this relation
for the Gauss-Seidel method, we use (6). We first have
(I
and by multiplying by (I
(7)
x{m+lJ
=
Cx{m)
+
+
+
L)x(m+lJ
=
b -
Ux(m)
L)-1 from the left,
(I
+ L) -1 b
where
C = -(I + L)
-1
U.
The Gauss-Seidel iteration converges for every x(O) if and only if all the eigenvalues
(Sec. 8.1) of the "iteration matrix" C = [CjkJ have absolute value less than l. (Proof in
Ref. [E5]. p. 191, listed in App. 1.)
CAUTION! If you want to get C, first divide the rows of A by au to have main diagonal
1, ... , 1. If the spectral radius of C (= maximum of those absolute values) is smalL
then the convergence is rapid.
Sufficient Convergence Condition.
A sufficient condition for convergence is
(8)
IICII
< 1
SEC. 20.3
849
Linear Systems: Solution by Iteration
Here
IIcll
is some matrix norm, such as
(9)
(Frobenius norm)
or the greatest of the sums of the
!cjkl in a colulIlll of C
(10)
IICIi =
n
max
k
or the greatest of the sums of the
2: !cjkl
(Column "sum" norm)
j=l
!cjkl in a row of C
n
(Row "sum" norm).
(11)
These are the most frequently used matrix norms in numerics.
In most cases the choice of one of these norms is a matter of computational convenience.
However, the following example shows that sometimes one of these norms is preferable
to the others.
E X AMP L E 1
Test of Convergence of the Gauss-Seidel Iteration
Test whether the
Gauss~"eidel
iteration converges for the system
2x+ y+ ;::=4
x + 2y
+ :: =
\" = 2 - ~x -~;::
written
4
:: = 2 - ~x
x+ y+2::=4
Solution.
-
h.
The decomposition (multiply the matrix by 112 - why?) is
In]
[l~
112
112
1/2
1
-U + L)-'U
~
1/2
=I+L+U=I
[0 °
+
112
112
1/2
112
~][:
1/2
In]
112
0
0
:] + [:
0
0
It shows that
c
~
0
- [
)12
-1/4
-112
We compute the Frobenius norm of e
lIell = ( -4I + -41
I
16
+-
I
1 + -9
+ -16+ -64
64
0
0
In]
-1/2
112
[:
0
114
-In]
-114
118
318
r
Y'2 = (-5064 2= 0884
< I
.
amI conclude from (8) that this Gauss-Seidel iteration converges. It is interesting that the other two norms would
permit no conclusion. as you ~hould verify. Of course. this points to the fact that (8) tS sufficient for convergence
rather than necessary.
•
Residual. Given a system Ax
defined by
(12)
b, the residual r of x with respect to this system is
r
= b - Ax.
CHAP. 20
850
Numeric Linear Algebra
Clearly, r = 0 if and only if x is a solution. Hence r *- 0 for an approximate solution. In
the Gauss-Seidel iteration, at each stage we modify or relax a component of an
approximate solution in order to reduce a component of r to zero. Hence the Gauss-Seidel
iteration belongs to a class of methods often called relaxation methods. More about the
residual follows in the next section.
Jacobi Iteration
The Gauss-Seidel iteration is a method of successive corrections because for each
component we successively replace an approximation of a component by a corresponding
new approximation as soon as the latter has been computed. An iteration method is called
a method of simultaneous corrections if no component of an approximation x(m) is used
until all the components of x(m) have been computed. A method of this type is the Jacobi
iteration, which is similar to the Gauss-Seidel iteration but involves not using improved
values until a step has been completed and then replacing x cm) by x(m+D at once, directly
before the beginning of the next step. Hence if we write Ax = b (with ajj = I as before!)
in the form x = b + (I - A)x, the Jacobi iteration in matrix notation is
x(7n-t-l) = b + (I - A)x(m)
(13)
(ajj
=
I).
This method converges for every choice of x(O) if and only if the spectral radius of 1 - A
is less than 1. It has recently gained greater practical interest since on parallel processors
alln equations can be solved simultaneously at each iteration step.
For Jacobi, see Sec. 10.3. For exercises. see the problem set.
- ....-----..... . ,
_~
_........
..-.
..---..
• . . . . . . . A _ _ _ . . . ~ ... • . . .
1. Verify the claim at the end of Example 2.
2. Show that for the system in Example 2 the Jacobi
iteration diverges. Him. Use eigenvalues.
13-81
-Xl
+
4X2 X2
Xl
+
X2
+
6X3
=
7.
-61.3
8.
185.8
4.
X2
5Xl
+
.1.'2
Xl
+
6X2
+
7X3
=
25.5
0
+
Xl
+
4X2 -
2x 1
+
3x2
+
X3
= -10.5
2X3
=
-2
8X3
=
39
21
X2
GAUSS-SEIDEL ITERATION
Do 5 steps, starring [rom Xo = [I
l]T and using 6S in
the computation. Hint. Make sure that you solve each
equation for the variable that has the largest coefficient
(why?). Show the details.
3.
6. 4Xl -
X3
= -45
+ 4x3 =
33
lOx]
+
X2
+
.\"3
6
Xl
+
lOx2
+
X3
6
x] +
x2
+
IOx3
6
4xl
+
5x3
=
12.5
Xl
+
6.\"2
+
2X3
=
18.5
8xl
+
2X2
+
X3
=
-1l.5
9. Apply the Gauss-Seidel iteration (3 steps) to the
system in Prob. 7, starting from (a) 0, 0, 0, (b) 10. 10,
10. Compare and comment.
10. In Prob. 7, compute C (a) if you solve the rust equation
for Xl. the second for .\"2' the third for X 3 , proving
convergence; (b) if you nonsensically solve the third
equation for Xlo the first for X 2 , the second for X3,
proving divergence.
SEC 20.4
851
Linear Systems: Ill-Conditioning, Norms
improvement of convergence. (Spectacular gains are
made with larger systems.)
11. CAS PROJECT. Gauss-Seidel Iteration. ta) Write
a program for Gauss-Seidel iteration.
(b) Apply the program to A(t)x = b, starting from
[0 0 O]T. where
A(t)
=
r:
t
J
b
=
rJ
For t = 0.2.0.5.0.8.0.9 determine the number of steps
to obtain the exact solution to 6S and the corresponding
spectral radius of C. Graph the number of steps and
the spectral radius as functions of t and comment.
(c) Successive overrelaxation (SOR). Show that by
adding and subtracting x Cm) on the right, formula (6)
can be written
XCm+l) =
X Cm )
+
b - LxCm + 1 )
-
(U
+
l)x Cm )
Anticipation of further corrections motivates the
introduction of an overrelaxation factor w > I to get
the SOR formula for Gauss-Seidel
X Cm + ll =
X Cm )
+ web -
112-151
Do 5 steps. starting from Xo = [I 1 IJT. Compare with
the Gauss-Seidel iteration. Which of the two seems to
converge faster? (Show the details of your work.)
12. The system in Prob. 6
13. The system in Prob. 5
14. The system in Prob. 8
15. Show convergence in Prob. 14 by verifying that I - A,
where A is the matrix in Prob. 14 with the rows divided
by the corresponding main diagonal entries. has the
eigenvalues -0.519589 and 0.259795 ::'::: 0.246603i.
I
116-20
NORMS
Compute the norms (9). (10), (II) for the following (square)
matrices. Comment on the reasons for greater or smaller
differences among the three numbers.
16. The matrix in Prob. 3
17. The matrix in Prob. 7
18. The matrix in Prob. 8
19.
Lx Cm + ll
(14)
20.4
r~
-k
(ajj = 1)
intended to give more rapid convergence. A
recommended value is w = 2/(1 + ,h - p). where p
is the spectral radius of C in (7). Apply SOR to the
matrix in (b) for t = 0.5 and 0.8 and notice the
JACOBI ITERATION
20.
r-:
17
-k
-2k
-:1
-k
2k
-3
-12
-:1
Linear Systems: Ill-Conditioning, Norms
One does not need much experience to observe that some systems Ax = b are good,
giving accurate solutions even under roundoff or coefficient inaccuracies. whereas others
are bad. so that these inaccuracies affect the solution strongly. We want to see what is
going on and whether or not we can "trust" a linear system. Let us first formulate the two
relevant concepts (ill- and well-conditioned) for general numeric work and then tum to
linear systems and matrices.
A computational problem is called ill-conditioned (or ill-posed) if "small" changes in
the data (the input) cause "large" changes in the solution (the output). On the other hand,
a problem is called well-conditioned (or well-posed) if "small" changes in the data cause
only "small" changes in the solution.
These concepts are qualitative. We would certainly regard a magnification of
inaccuracies by a factor 100 as "large," but could debate where to draw the line between
"large" and "small." depending on the kind of problem and on uur viewpoint. Double
precision may sometimes help, but if data are measured inaccurately, one should attempt
changing the mathematical setting of the problem to a well-conditioned one.
852
CHAP. 20
Numeric Linear Algebra
Let us now tum to linear systems. Figure 442 explains that ill-conditioning occurs if
and only if the two equations give two nearly parallel lines. so that their intersection point
(the solution of the system) moves substantially if we raise or lower a line just a little.
For larger systems the situation is similar in principle, although geometry no longer helps.
We shall see that we may regard ill-conditioning as an approach to singularity of the
matrix.
y
y
x
x
(a)
(b)
Fir 442. (a) Well-conditioned and (b) ill-conditioned
linear system of two equations in two unknowns
c X AMP L ElAn III-Conditioned System
You may verify that the system
0.9999x -
1.000ly = 1
xhas the solution x = 0.5. Y = -0.5,
wherea~
y=l
the syMem
0.9999x - 1.000ly
x-
=
1
)"=I+E
has the solution x = 0.5 + 5000.5 E. Y = -0.5 + 4999.5 E. Thi~ shows that the system is ill-conditioned because
a change on the right of magnitude E produces a change in the solution of magnitude 5000E. approximately. We
see that the lines given by the equations have nearly the same slope.
•
Well-conditioning can be asserted if the main diagonal entries of A have large absolute
values compared to those of the other entries. Similarly if A-I and A have maximum
entries of about the same absolute value.
Ill-conditioning is indicated if A-I has entries of large absolute value compared to those
of the solution (about 5000 in Example I) and if poor approximate solutions may still
produce small residuals.
Residual.
The residual r of an approximate solution
(2)
b is defined as
r = b - Ax.
(1)
Now b
xof Ax =
=
Ax, so that
r
=
A(x -
x).
Hence r is small if x has high accuracy, but the converse may be false:
SEC. 20.4
85}
Linear Systems: Ill-Conditioning, Norms
E X AMP L E 1
Inaccurate Approximate Solution with a Small Residual
The system
1.0001.\"1 +
Xl
+ I.OOOlx2
=
2.0001
has the exact solution xl = 1, -'"2 = I. Can you see this by inspection? The very inaccurate approximation
Xl = 2.0000, X2 = 0.0001 has the very small residual (to 4D)
2.0001 J [ 1.0001 I.OOOOJ [2.0000J [2.000IJ _ [2.0003J = [-O.0002J .
[2.0001 - 1.0000 1.0001 0.0001
2.0001
2.0001
0.0000
=
r =
From this. a naive per,on might draw the false conc\u,ion that the appfoximation should be accumte to 3 Of 4
decimals.
Our result is probably unexpected. but we shall see that it has to do with the fact that the system is
ill-conditioned.
•
Our goal is to show that ill-conditioning of a linear system and of its coefficient matrix
A can be measured by a number. the "condition number" K(A). Other measures for
ill-conditioning have also been proposed, but K(A) is probably the most widely used one.
K(A) is defined in terms of norm. a concept of great general interest throughout numerics
(and in modem mathematics in general!). We shall reach our goal in three steps. discussing
1. Vector norms
2. Matrix norms
3. Condition number
K
of a square matrix.
Vector Norms
A vector norm for column vectors x = [Xj] with 11 components (11 fixed) is a generalized
length or distance. It is denoted by II xII and is defined by four properties of the usual
length of vectors in three-dimensional space. namely.
(3)
(a)
IIxll is a nonnegative real number.
(b)
IIxll =0
(c)
IIkxll = Iklllxll
Cd)
Ilx
+
if and only if
yll ~ Ilxll
x
= O.
for all k.
+
Ilyll
(Triangle inequality).
If we use several norms, we label them by a subscript. Most important in connection with
computations is the p-Ilorm defined by
(4)
where p is a fixed number and p ~ 1. In practice, one usually takes p
third norm, Ilxlix (the latter as defined below), that is,
+ ... +
(5)
IIxlll = IX11
(6)
IIxl12
= YX12 + ... + xn2
(7)
IIxlix
= ~x
J
Ixjl
= 1 or 2 and. as a
IXnl
("Euclidean" or "12-norm")
("lx-norm").
854
CHAP. 20
Numeric Linear Algebra
For n = 3 the 12-norm is the usual length of a vector in three-dimensional space. The
lrnorm and lx-norm are generally more convenient in computation. But all three norms
are in common use.
E X AMP L E 3
Vector Norms
IfxT
=
[2
-3
0
L -4], then
IIxll! =
IIxll2 = V30, Ilxll oo =
10,
•
4.
In three-dimensional space, two points with position vectors x and xhave distance Ix - xl
from each other. For a linear system Ax = b, this suggests that we take IIx - xii as a
measure of inaccuraty and call it the distance between an exact and an approximate solution.
or the error of x.
Matrix Norm
If A is an n X n matrix and x any vector with n components, then Ax is a vector with 11
components. We now take a vector norm and consider Ilxll and IIAxll. One can prove (see
Ref. [E17]. p. 77, 92-93, listed in App. 1) that there is a number c (depending on A) such
that
IIAxll
(8)
~
cllxll
for all x.
Let x*- O. Then IIxll > 0 by (3b) and division gives IIAx11/1lx11 ~ c. We obtain the smallest
possible c valid for all x (=1= 0) by taking the maximum on the left. This smallest c is
called the matrix norm of A corresponding to the vector norm we picked and is denoted
by IIAII. Thus
(9)
IIAII
the maximum being taken over all x
(10)
=1=
=max
IIAxl1
(x =1= 0),
W
O. Alternatively [see (c) in Team Project 24],
IIAII
= max
IIxll=
1
IIAxll·
The maximum in (10) and thus also in (9) exists. And the name "matrix Ilorm" is
justified because IIAII satisfies (3) with x and y replaced by A and B. (Proofs in Ref.
[EI7] pp. 77,92-93.)
Note carefully that IIAII depends on the vector norm that we selected. In particular,
one can show that
for the lrnorm (5) one gets the column "sum" norm (10), Sec. 20.3,
for the lx-norm (7) one gets the row "sum" norm (11), Sec. 20.3.
By taking our best possible (our smallest)
(11)
IIAxl1
~
c = IIAII
we have from (8)
IIAllllxll·
This is the formula we shall need. Formula (9) also implies for two
Ref. [E17], p. 98)
(12)
thus
11
X n
matrices (see
SEC. 20.4
855
Linear Systems: III-Conditioning, Norms
See Refs. [E9] and [E 17] for other useful formulas on norms.
Before we go on. let us do a simple illustrative computation.
E X AMP L E 4
Matrix Norms
Compute the matrix norms of the coefficient matrix A in Example I and of its inver,e A-I. assuming that we
use (a) the lrvector norm, (b) the loo-vector norm.
Solutioll.
We use (4*). Sec. 7.!l, for the inverse and then (10) and (II) in Sec. 20.3. Thus
0.9999
-1.0001 ]
A-I =
A= [
1.0000 - 1.0000
-5000.0
[ -5000.0
5000.5J.
4999.5
(a) The Irvector norm gives the column "sum" norm (10). Sec. 20.3: from Column 2 we thus obtain
= 1-1.00011 + 1-1.00001 = 2.0001. Similarly. IIA -111 = 10000.
IIAII
(b) The lx-vector norm gives the row "sum" norm (11), Sec. 20.3: thus IIAII = 2, IIA -111 = 10000.5 from
Row 1. We notice that IIA -111 is surprisingly large. which makes the product IIAII IIA -111 large (20001). We
shall see below that this is typical of an ill-conditioned system.
•
Condition Number of a Matrix
We are now ready to introduce the key concept in our discussion of ill-conditioning, the
condition number K(A) of a (nonsingular) square matrix A, defined by
(13)
K(A)
= IIAII IIA -III·
The role of the condition number is seen from the following theorem.
THEOREM 1
Condition Number
A linear system of equations Ax = b and its matrix A whose condition number (13)
is small are well-cmzditioned. A large condition number indicates ill-conditioning.
PROOF
b = Ax and (11) give IIbll ::; IIAII IIxll. Let b =1= 0 and x =1= O. Then division by
Ilbll IIxll gives
1
IIxll
"All
lib II
-<--
(14)
=
Multiplying (2) r = Atx - x) by A-I from the left and interchanging sides, we have
x - x = A -Ir . Now (11) with A -1 and r instead of A and x yields
IIx - xII
=
IIA -Irll ~ IIA -III IIrll.
Division by IIxll [note that IIxll =1= 0 by (3b)] and use of (14) finally gives
(15)
M
IIx - xII < _1_
-1
< IIAII
-1
_
IIxll
= IIxll IIA IIlIrll = IIbll IIA II IIrll - K(A) IIbil .
Hence if K(A) is small, a small IIrll/llbll implies a small relative error IIx - xll/llxll, so
that the system is well-conditioned. However, this does not hold if K(A) is large; then a
small IIrli/llbil does not necessarily imply a small relative error IIx - xll/llxll
•
856
E X AMP L E 5
CHAP. 20
Numeric Linear Algebra
Condition Numbers. Gauss-Seidel Iteration
5
[
A =:
I]
has the
: :
Since A is symmetric. (10) anll
inver~e
(II) in Sec. 20.3 give the same condition number
We see that a linear system Ax = b "ith this A is well-conditioned.
9]T (confirm this).
For instance. if b = [14 0 28]T. the Gauss algorithm gives the solution x = [2 -5
Since the main diagonal entries of A are relatively large. we can expect reasonably good convergence of the
Gauss-Seidel iteration. Inlleed, qaning from. say, Xo = [I I qT. we obtain the first 8 steps (3D values)
EXAMPLE 6
Xl
X2
X3
1.000
2.400
1.630
1.870
1.967
1.993
1.998
2.000
2.000
1.000
-1.100
-3.882
-4.734
-4.942
-4.988
-4.997
-5.000
-5.000
1.000
6.950
8.534
8.900
8.979
8.996
8.999
9.000
9.000
•
III-Conditioned Linear System
Example 4 gives by (10) or (II). Sec. 20.3, for the matrix in Exmnple I the very large condition number
K(A) = 2.0001 . 10 000 ~ 2· 10 000.5 ~ 200001. This confirms that the sy,tem is very ill-conditioned.
Similarl) in Example 2. where by (4""). Sec. 7.8 and 6D-computation.
A-I = _1_ [
0.0001
-1.oo00J = [
1.0001
-1.0000
l.UOOI
5000.5
-5000.0
-5.000.0J
5000.5
so that (10), Sec. 20.3, gives a very large K(A), explaining the surprising result in Example 2.
K(A) =
(1.0001 + 1.0(00)(5000.5 + 5000.0) = 20002.
•
In practice, A-I will not be known, so that in computing the condition number K(A), one
must estimate IIA -111. A method for this (proposed in 1979) is explained in Ref. [E9]
listed in App. I.
Inaccurate Matrix Entries.
KlA) can be used for estimating the effect ox of an
inaccuracy oA of A (errors of measurements of the Qjb for instance). Instead of Ax = b
we then have
(A
+ oA)(x + ox)
Multiplying out and subtracting Ax
Aox
=
=
b.
b on both sides, we obtain
+ oA(x + ox) = O.
Multiplication by A-I from the left and taking the second term to the right gives
SEC. 20.4
857
Linear Systems: III-Conditioning, Norms
+
Applying (II) with A-I and vector oA(x
Applying (11) on the right, with
oA and x
ox) instead of A and x, we get
- ox instead of A and x, we obtain
lIoxll ~ IIA -111 IISAII IIx
+
Sxll .
Now IIA -111 = K(A)/IIAil by the definition of K(A), so that division by IIx + oxll shows
that the relative inaccuracy of x is related to that of A via the condition number by the
inequality
II ox II
W
(16)
= IIx
II ox II
+ oxll
<:
IIA-IIIII
=
~AII
b
_
-
A
K(
IIoAil
) IIAII .
Conclusion, If the system is well-conditioned, small inaccuracies IISAII/IIAil can have
only a small effect on the :>olution. However. in the case of ill-conditioning. if II oA 11/11 A II is
small. IISxll/llxll11lay be large.
Inaccurate Right Side. You may show that. similarly, when A is accurate, an inaccuracy
Sb of b causes an inaccuracy ox satisfying
(17)
Hence lIoxll/llxll must remain relatively small whenever K(A) is small.
E X AMP L E 7
Inaccuracies. Bounds (16) and (17)
If each of the nine entries of A in Example 5 i_ measured with an inaccuracy of 0.1. then 115A::
(16) gives
115xll
3 '0.1
lJ;f ~ 7.5' - 7-
= 0.321
115xll ~ 0.321 IIxll
thus
= 0.321
. 16
=
9' 0.1 and
= 5.14.
By experimentation you will find that the actual inaccuracy Iloxll is only about 30% of the bound 5.14. This is
typicaL
Similarly. if 5b = [0.1 0.1 O.I]T. then 115bll = 0.3 and Ilbll = 42 in Example 5. so that (17) gives
115xll
TxlI
~ 7.5'
0.3
42 = 0.0536.
hence
115xll ~ 0.0536' 16 = 0.857
but this bound is again much greater than the actual inaccuracy. which is about 0.15.
Further Comments on Condition Numbers.
may be helpful.
•
The following additional explainarions
1. There is no sharp dividing line hetween "well-conditioned" and "ill-conditioned,"
but generally the situation will get worse as we go from systems with small K(A) to systems
with larger K(A). Now always K(A) ~ I, so that values of 10 or 20 or so give no reason
for concern, whereas K(A) = 100, say, calls for caution. and systems such as those in
Examples I and 2 are extremely ill-conditioned.
CHAP. 20
858
Numeric Linear Algebra
2. If K(A) is large (or small) in one norm, it will be large (or small, respectively) in
any other norm. See Example 5.
3. The literatme on ill-conditioning is extensive. For an introduction to it, see [E9].
This is the end of our discussion of numerics for solving linear systems. In the next
section we consider curve fitting, an important area in which solutions are obtained from
linear systems.
........ --
-.... ..
.........
==:=:=::: =........_-,._==.-_----,-
-
~~
VECTOR NORMS
Compute (5), (6), (7). Compute a corresponding unit vector
(vector of norm 1) with respect to the too-norm.
1. [1 -6 5]
2. [0.4
3. [-4
4.
5.
6.
7.
8.
- 1.2
4 3
0 8.01
-3]
18. Verify the calculations in Examples 5 and 6 of the text.
119-2°1
ILL-CONDITIONED SYSTEMS
Solve Ax = bb Ax = b 2 • compare the soLutions. and
comment. Compute the condition number of A.
[0 0
0 0]
-0.1
[0.3
0.5 LO]
[L6 21 54 -119]
[1 1 1 1 1 1]
[3 0 0 -3 0]
9. Show that
16. Verify (II) for x = [4 -5 2]T taken with the
loe-norm and the matrix in Prob. L5.
17. Verify (12) for the matrices in Probs. to and 1 L
19. A =
Ilxll oo ~ IIxl12 ~ Ilxlli.
[
LAJ . b i -- [1.4J
2
1.4
1
1
•
b2 -- [1.44J
1
20. A= [ 5 -7J. b= [-2J .b = [-2J
-7
110-151
10
i
3
2
3.1
MATRIX NORMS.
CONDITION NUMBERS
Compute the matrix norm and the condition number
corresponding to the II-vector norm.
21. (Residual) For Ax = b] in Prob. 19 guess what the
residuaL of = [113 -160]T might be (the solution
II.
22. Show that K(A) ~ 1 for the matrix norms (10), (11),
Sec. 20.3, and K(A) ~ V;; for the Frobenius nOim (9),
Sec. 20.3.
13.
[ 57
ro~,
21
10.5
7
5.25
10.5
7
5.25
4.2
7
5.25
4.2
3.5
5.25
4.2
3.5
3
14.
15.
[0~1
01
0.1
0]
o
100
o
'~l
x
being x = [0
I]T). Then calculate and comment.
23. CAS EXPERIMENT. Hilbert Matrices. The 3 X 3
Hilbert matrix is
The n X n HiLbert matrix is Hn = fhjk], where
hjk = l/(j + k - 1). (Similar matrices occur in curve
fitting by Least squares.) Compute the condition number
K(Hn) for the matrix norm corresponding to the loc- (or
ld vector norm, for fl = 2, 3, ... ,6 (or further if you
wish). Try to find a formula that gives reasonable
approximate values of these rapidly growing numbers.
Solve a few linear systems involving an Hn of your
choice.
SEC. 20.5
859
Least Squares Method
24. TEAM PROJECT. Norms. (a) Vector norms in our
text are equivalent, that is, they are related by double
inequalities; for instance,
(a)
I\xl\x ~ I\XI\I ~ IIl\xIL,,:
(b)
-lIxllI ~ IIxlix ~ IIxll l
(18)
1
·
11
Hence if for some x, one norm is large (or small), the
other norm must also be large (or small). Thus in many
investigations the particular choice of a noml is not
essential. Prove (18).
(b) The Cauchy-Schwarz inequality is
(19b)
(c) Formula (10) is often more practical than (9).
Derive (10) from (9).
(d) Matrix normS. lllustrate (11) with examples.
Give examples of (12) with equality as well as with
strict inequality. Prove that the matrix norms (10),
(II) in Sec. 20.3 satisfy the axioms of a nonn
IIL\II ~
IIAI\
= 0 if and only if A = o.
IIkAil
IIA
It is very important. (Proof in Ref. [GR7] listed in
App. 1.) Use it to prove
(l9a)
20.5
o.
+ BII
=
IkIIiAII,
~ IIAII
+ IIBII·
25. WRITING PROJECT. Norms and Their Use in
This Section. Make a list of the most important of the
many ideas covered in this section and write a two-page
report on them.
Least Squares Method
Having discussed numerics for linear systems, we now turn to an important application.
curve fitting. in which the solutions are obtained from linear systems.
In curve fitting we are given 11 points (pairs of numbers) (Xb "1)' ... , (Xn , Yn) and we
want to determine a function f(x) such that
approximately. The type of function (for example. polynomials. exponential functions, sine
and cosine functions) may be suggested by the nature of the problem (the underlying physical
law, for instance), and in many cases a polynomial of a certain degree will be appropriate.
Let us begin with a motivation.
If we require strict equality [(Xl) = )'1, . . . , f(xn) = Yn and use polynomials of
sufficiently high degree, we may apply one of the methods discussed in Sec. 19.3 in
connection with interpolation. However, in certain situations this would not be the
appropriate solution of the acmal problem. For instance. to the four points
(1)
(-1.3.0.103),
(-0.1. 1.099),
(0.2, 0.808),
there corresponds the interpolation polynomial [(x)
:~
Fig. 443.
= X3
-
X
(l.3, 1.897)
+ I (Fig. 443), but if we
/
t=ilx
~
ApprOXimate fitting of a straight line
860
CHAP. 20
Numeric Linear Algebra
graph the points, we see that they lie nearly on a straight line. Hence if these values are
obtained in an experiment and thus involve an experimental error, and if the nature of the
experiment suggests a linear relation, we better fit a straight line through the points (Fig.
443). Such a line may be useful for predicting values to be expected for other values of
x. A widely used principle for fitting straight lines is the method of least squares by
Gauss and Legendre. In the present situation it may be formulated as follows.
Method of Least Squares.
The straight line
(2)
y
=a+
bx
should be fitted through the givell points (Xl, Yl)' ...• (Xn . Yn) so that the sum of
the squares of the distances of those points from the straight line is minimum. where
the distance is measured in the vertical direction (the .v-direction).
The point on the line with abscissa Xj has the ordinate a + bXj. Hence its distance from
(Xj, .\) is h1 - a - bxjl (Fig. 444) and that sum of squares is
n
q =
L
()J - (/ - bXjl.
j~l
q depends on a and b. A necessary condition for q
to
be minimum is
(3)
oq
ob
= - 2L
Xj
(\J - a - b;f)
=
0
(where we sum over j from 1 to n). Dividing by 2, writing each sum as three sums, and
taking one of them to the right, we obtain the result
(4)
an
+b
a~
L.J x·J
+
~
L.J X·J
b~
L.J
= ~ y.
X·2 =
J
L.J
.J
~
L.J
X·V·
J-J'
These equations are called the normal equations of our problem.
Vertical distance of a point (xi' Yj)
from a straight line y = a + bx
Fig. 444.
SEC. 20.5
861
Least Squares Method
E X AMP L E 1
Straight Line
Using the method of lea~t squares. fit a straight line to the four points given in formula (I).
Solutioll.
11
We obtain
= 4,
2::>'J =
2: x/ =
0.1,
2: Yj =
3.43,
2: XjYj =
3.907,
2.3839.
Hence the normal equations are
4ll + O.JOb
=
3.9070
O.lll + 3.43b
=
2.3839.
The solution <rounded to 4D) is a = 0.9601, b = 0.6670. and we obtain the straight line (Fig. 443)
y = 0.9601
•
+ 0.6670T.
Curve Fitting by Polynomials of Degree m
Our method of curve fitting can be generalized from a polynomial y
polynomial of degree III
(5)
where
p(x)
III ~ 11 -
a
+
bx to a
= b o + bix + ... + b",x'Tn
l. Then q takes the form
"
q =
2: (Yj -
p(Xj»2
j~I
and depends on
conditions
III
+
I parameters bo, ... , bm . Instead of (3) we then have
aq
(6)
abo
aq
abm
= 0,
=
0
which give a system of 111 + I normal equations.
In the case of a quadratic polynomial
(7)
the normal equations are (summation from I to n)
+ b 2: Xj + b2 2: x/ = 2: )j
ol1
boL
+ b 2: xl + b 2: x/ = 2:
bo 2: xl + b] 2: x/ + b 2: x/ = 2: xl,vj·
b
(8)
i
Xj
i
2
2
The derivation of (8) is left to the reader.
XjYj
III
+
I
862
CHAP. 20
E X AMP L E 2
Numenc Linear Algebra
Quadratic Parabola by Least Squares
Fit a parabola through the data (0. 5), (2. 4), (4, 1), (6, 6). (8, 7).
Solul;on. For the nonnal equations we need 11 = 5, L.\) = 20,
L)"j = 23, LXj)] = 104. LX/)'j = 696. Hence these equations are
5bo + 20b! + 120b2
20bo
=
LX/
= 120, LX/ = 800.
2.x/ = 5664.
23
+ 120b! + 800b2 = 104
120bo + 800b! + 5664b 2
= 696.
Solving them we obtain the quadratic least squares parabola (Fig. 445)
y = 5.11429 - 1.4l429x
•
+ O.21429x2.
y
8
•
2
•
o
Fig. 445.
2
4
6
8
x
Least squares parabola in Example 2
For a general polynomial (5) the normal equations form a linear system of equations in the
unknowns b o, ... , b"n" When its matrix M is nonsingular, we can solve the system by
Cholesky's method (Sec. 20.2) because then M is positive definite (and symmetric). When
the equations are nearly linearly dependent, the normal equations may become illconditioned and should be replaced by other methods; see [E5], Sec. 5.7, listed in App. 1.
The least squares method also plays a role in statistics (see Sec. 25.9).
11-61
FITTING A STRAIGHT LINE
Fit a straight line to the given points (x, y) by least squares.
Show the details. Check your result by sketching the points
and the line. Judge the goodness of fit.
1. (2,0), (3, 4), (4. lO), (5. 16)
2. How does the line in Prob. I change if you add a point
far above it, say. (3. 20)?
3. (2.5, 8.0), (5.0, 6.9), (7.5. 6.2), (10.0, 5.0)
4. (Ohm's law U = Ri) Estimate the resistance R from
the least squares line that fits (i, U) = (2.0, 104).
(4.0,206), (6.0, 314), (10.0, 530).
5. (Average speed) Estimate the average speed vav of a
car traveling according to s = v • t [km] (s = distance
traveled. t [h] = time) from (t, s)
(11, 3lO). (\2. 4lO).
=
(9, 140), (10, 220),
6. (Hooke's law F = ks) Estimate the spring modulus k
from the force F [lb] and extension s [cm], where
(F, s) = (1, 0.50), (2, 1.02), (4, 1.99), (6, 3.01),
(10.4.98), (20. 10.03).
7. Derive the normal equations (8).
18-10
1
FITTING A QUADRATIC PARABOLA
Fit a parabola (7) to the given points Cx, y) by least squares.
Check by Sketching.
8. (-1,3), (0, 0), (I. 2). (2. 8)
9. (0,4), (2, 2), (4, -1), (6, -5)
SEC 20.6
863
Matrix Eigenvalue Problems: Introduction
10. Worker's time on duty x [h]
2
Worker's reaction lime [sec] 1.50 1.28
lAO 1.85 2.20
16. TEAM PROJECT. The least squares approximation
of a function f(x) on an interval a ;;a x ;;a h by a function
11. Fit (2) and (7) by lea<;t squares to (-1.0,5.4), (-0.5,4.1),
(0,3.9), (0.5,4.8), (1.0, 6.3), (l.5, 9.3). Graph the data
and the curves on common axes and comment.
where Yo(x), ... ,y",(x) are given functions, requires the
determination of the coefficients ao, .•• , am such that
3
4
5
12. (Cubic parabola) Derive the formula for the normal
equations of a cubic least squares parabola.
13. Fit curves (2) and (7) and a cubic parabola by least
squares to (-2, -35), (-1. -9), (0, -I), (I, -I),
(2, 17). (3. 63). Graph the three curves and the points
on common axes. Comment on the goodness of fit.
14. CAS PROJECT. Least Squares. Write programs for
calculating and solving the normal equations (4) and
(8). Apply the programs to Probs. 3, 5, 9, 11. If your
CAS has a command for fitting (Maple and
Mathematica do), compare your results with those by
your CAS commands.
15. CAS EXPERIMENT. Least Squares versus
Interpolation. For the given data and for data of your
choice find the interpolation polynomial and the least
squares approximations (linear. quadratic, etc.).
Compare and comment.
(a) (-2,0), (-I, 0), (0, I), (I, 0), (2, 0)
(b) (-4. 0), (-3. 0), (-2. 0). (-1. 0), (0, I).
(I, 0). (2. 0), (3. 0). (4. 0)
I
(9)
b
[f(x) - Fm(x)]2 dx
a
becomes mmlmum. This integral is denoted by
Ilf - Fm1l 2 , and IIf - Fmll is called the L 2 -norm of
f - F m (L suggesting Lebesgue2 ). A necessary condition
for that minimum is given by allf - F ml12/aaj = 0,
j = 0, ... , m [the analog of (6)]. (a) Show that this
leads to m + 1 normal equations (j = 0, ... , m)
m
2:
hjkak = b j
where
k=O
(10)
hjk =
I
I
b
)'j(X)Yk(X) dx,
a
bj =
b
!(X)'\".i(x) dx.
Q
(C) Choose five points on a straight line, e.g., (0, 0),
(l, 1), ... , (4, 4). Move one point 1 unit upward and
(b) Polynomial. What form does (lO) take if
Fm(x) = a o + lllX + ... + amx"'? What is the
coefficient matrix of (10) in this case when the interval
is 0 ;;a x;;a I?
find the quadratic least squares polynomial. Do this
for each point. Graph the five polynomials on
Cornmon axes. Which of the five motions has the
greatest effect?
(c) Orthogonal functions. What are the solutions of
(10) if .\'o(x), ... , Ym(x) are orthogonal on the interval
a ;;a x ;;a b? (For the definition, see Sec. 5.7. See also
Sec. 5.8.)
20.6
Matrix Eigenvalue Problems: Introduction
In the remaining sections of this chapter we discuss some of the most important ideas and
numeric methods for matrix eigenvalue problems. This very extensive part of numeric
linear algebra is of great practical importance, with much research going on, and hundreds,
if not thousands of papers published in various mathematical journals (see the references
in [ES], [E9], [Ell], [E29]). We begin with the concepts and general results we shall need
in explaining and applying numeric methods for eigenvalue problems. (For typical models
of eigenvalue problems see Chap. 8.)
2HENRI LEBESGUE (1875-1941). great French mathematician. creator of a modern theory of measure and
integration in his famous doctoral thesis of 1902.
864
CHAP. 20
Numeric Linear Algebra
An eigenvalue or characteristic value lor latent root) of a given Il X n matrix A = [ajk]
is a real or complex number A such that the vector equation
Ax
(1)
= Ax
has a nontrivial solution, that is, a solution x "* 0, which is then called an eigenvector or
characteristic vector of A corresponding to that eigenvalue A. The set of all eigenvalues
of A is called the spectrum of A. Equation (I) can be written
lA - AI)x = 0
(2)
where I is the n X n unit matrix. This homogeneous system has a nontrivial solution if
and only if the characteristic determinant det (A - AI) is 0 (see Theorem 2 in Sec. 7.5).
This gives (see Sec. 8.1)
Eigenvalues
THEOREM 1
The eigenvalues of A are the solutions A of the characteristic equation
all -
A
a21
(3)
a 1n
a12
a22 -
A
a2n
det (A - AI) =
= O.
ann -
an2
anI
A
Developing the characteristic determinant, we obtain the characteristic polynomial of A,
which is of degree n in A. Hence A has at least one and at most n numerically different
eigenvalues. If A is real, so are the coefficients of the characteristic polynomial. By familiar
algebra it follows that then the roots (the eigenvalues of A) are real or complex conjugates
in pairs.
We shall usually denote the eigenvalues of A by
with the understanding that some (or all) of them may be equal.
The sum of these n eigenvalues equals the sum of the entries on the main diagonal of
A, called the trace of A; thus
n
(4)
trace A =
L
j~1
n
ajj
=
L
Ak .
k~1
Also, the product of the eigenvalues equals the determinant of A,
(5)
Both formulas follow from the product representation of the characteristic polynomial,
which we denote by f(A),
SEC. 20.6
865
Matrix Eigenvalue Problems: Introduction
If we take equal factors together and denote the numerically distinct eigenvalues of A by
Ab ... , AI" (r ~ 11), then the product become~
(6)
The exponent 111j is called the algebraic multiplicity of Aj • The maximum number of
linearly independent eigenvectors corresponding to Aj is called the geometric multiplicity
of Aj . It is equal to or smaller than 111j.
A subspace S of R n or en (if A is complex) is called an invariant subspace of A if
for every v in S the vector A,- is also in S. Eigenspaces of A (spaces of eigenvectors;
Sec. 8.1) are important invaJiant subspaces of A.
An 11 X 11 matrix B is called similar to A if there is a nonsingular n X 11 matrix T such
that
(7)
Similarity is important for the following reason.
THEOREM 2
Similar Matrices
Similar matrices have the same eigellmlues. If x is an eigenvector of A, then
y = T-Ix is all eigel1l'ector of B in (7) wrrespollding to the sallie eigellmille. (Proof
in Sec. 8.4.)
Another theorem that has vaJious applications in numerics is as follows.
THEOREM 1
Spectral Shift
IfA has the eigenvalues A1> •.• , An' then A - kI with arbitrary k has the eigenvalues
Al - k . .. '. An - k.
This theorem is a special case of the following spectral mapping theorem.
THEOREM 4
Polynomial Matrices
If A is all eigenvalue of A, then
is all eigenvalue of the polynomial matrix
q(A)x
+ O's_lAs - 1 + ... ) x
= asAsx + as_lAs-Ix + .. .
= asAsx + as_lAs-Ix + ... = q(A) x.
=
(asAS
•
CHAP. 20
866
Numeric Linear Algebra
The eigenvalues of important special matrices can be characterized as follows.
Special Matrices
THEOREM 5
The eigenvalues of Hermitian matrices (i.e., -AT = A), hence of real symmetric
matrices (i.e., AT = A), lire real. The eigenvalues of skew-Hermitian matrices (i.e.,
AT = -A), hence of real skew-symmetric matrices (i.e., AT = -A) are pure
imaginary or O. The eigenvalues of unifllry matrices (i.e., AT = A-I). hence of
orthogonalmlltrices (i.e .• AT = A-I), have absolute value l. (Proofs in Secs. 8.3
and 8.5.)
The choice of a numeric method for matrix eigenvalue problems depends essentially on
two circumstances, on the kind of matrix (real symmetric, real general, complex, sparse,
or full) and on the kind of information to be obtained, that is, whether one wants to know
all eigenvalues or merely specific ones, for instance, the largest eigenvalue, whether
eigenvalues lind eigenvectors are wanted. and so on. It is clear that we cannot enter into
a systematic discussion of all these and further possibilities that arise in practice, but we
shall concentrate on some basic aspects and methods that will give us a general
understanding of this fascinating field.
20.7
Inclusion of Matrix Eigenvalues
The whole numerics for matrix eigenvalues is motivated by the fact that except for a few
trivial cases we cannot determine eigenvalues exactly by a finite process because these
values are the roots of a polynomial of Ilth degree. Hence we must mainly use iteration.
In this section we state a few general theorems that give approximations and error
bounds for eigenvalues. Our matrices will continue to be real (except in formula (5) below).
but since (nonsymmetric) matrices may have complex eigenvalues. complex numbers will
playa (very modest) role in this section.
The important theorem by Gerschgorin gives a region conSisting of closed circular disks
in the complex plane and including all the eigenvalues of a given matrix. Indeed. for each
j = 1. .... n the inequality (1) in the theorem detennines a closed circular disk in the
complex A-plane with center {ljj and radius given by the right side of (1); and Theorem
1 states that each of the eigenvalues of A lies in one of these n disks.
Gerschgorin's Theorem
THEOREM 1
Let A be all eigenvalue qf all arbitrary
integer j (l ~ j ~ 11) we have
PROOF
Il
X
Il
matrix A
[ajk]. Then for some
Let x be an eigenvector corresponding to an eigenvalue A of A. Then
(2)
Ax = Ax
or
(A - AI)x = O.
Let Xj be a component of x that is largest in absolute value. Then we have /x11 /Xj/ ~ I for
= 1, ... , n. The vector equation (2) is equivalent to a system of 11 equations for the
I7l
SEC. 20.7
867
Inclusion of Matrix Eigenvalues
n components of the vectors on both sides. The jth of these n equations with j as just
indicated is
Division by Xj (which cannot be zero; why?) and reshuffling terms gives
By taking absolute values on both sides of this equation, applying the triangle inequality
la + bl ~ lal + Ibl (where a and b are any complex numbers), and observing that because
of the choice of j (which is crucial !), IX1lxjl ~ 1, ... , IXn1xjl ~ I, we obtain (l), and the
theorem is proved.
•
E X AMP L E 1
Gerschgorin's Theorem
For the eigenvalues of the matrix
112
5
we get the Gerschgorin disks (Fig. 446)
°
1:
Center 0,
radiu~
I,
02:
Center 5.
radius 1.5.
03:
Center 1.
radius 1.5.
The centers are the main diagonal entries of A. These would be the eigenvalues of A if A were diagonal.
We can take these values as crude approximation~ of the unknown eigenvalues (3D values) Al = -0.209,
A2 = 5.305. A3 = 0.904 (verify this): then the radii of the disks are corresponding error bounds.
Since A is symmetric, it follow, from Theorem 5, Sec. 20.6, that the spectrum of A must actually lie in the
intervals [-1. 2.5] and [3.5, 6.5].
It is interesting that here the Gerschgorin disks form two disjoint sets, namely, 01 U 3 , which contains two
•
eigenvalues, and 02. which contains one eigenvalue. This is typical. as the following theorem shows.
°
y
x
Fig. 446.
THEOREM 2
Gerschgorin disks in Example 1
Extension of Gerschgorin's Theorem
If p Gerschgorin disks fom! a set S that is disjoint from the n - p other disks of a
given matrix A. t!zen S contains precisely p eigenl'Q/ues of A (each cOllnted with its
algebraic multiplicity. as defined in Sec. 20.6).
Idea of Proof. Set A
apply Theorem I to At
= B + C, where B is the diagonal matrix with entries
= B + tC with real t growing from 0 [0 1.
ajj. and
•
868
E X AMP L E 2
CHAP. 20
Numeric Linear Algebra
Another Application of Gerschgorin's Theorem. Similarity
Suppose that we have diagonalized a matrix by some numeric method that left us with some off-diagonal entries
of size 10- 5 , say,
5
10- ]
10-5
2
.
4
What can we conclude about
deviation~
of the
eigenvalue~
from the main diagonal
entrie~?
By Theorem 2. one eigenvalue must lie in the disk of radius 2· 10- 5 centered at 4 and two
5
eigenvalues (or an eigenvalue of algebraic mulliplicity 2) in the disk of radius 2·
centered at 2. Actually,
since the matrix is symmetric. these eigenvalues must lie in the intersections of these disks and the real axis.
by Theorem 5 in Sec. 20.6.
We sho" how an isolated disk can always be reduced in size by a similarity transformation. The matrix
Solution.
10-
o
5
10- [I0
10-
l~
o
5]
o
4
2
0
o
]
;]
is similar to A. Hence by Theorem 2. Sec. 20.6. it has the same eigenvalues as A. From Row 3 we get the smaller
disk of radius 2· 10- 10. Note that the other disks got bigger. approximately by a factor of 105 . And in choosing
T we have to watch lhat the new disks do not overlap with the disk whose size we want to decrease.
For funher interesting facts, see the new book [E28j.
•
By definition, a diagonally dominant matrix A = [ajk] is an n X n matrix such that
lajjl
(3)
~
L lajkl
j
=
I,"', n
k*j
where we sum over all off-diagonal entries in Row j. The matrix is said to be strictly
diagonally dominant if > in (3) for all j. Use Theorem I to prove the following basic
property.
THEOREM 3
Strict Diagonal Dominance
Strictly diagonally dominant matrices are nOllSingular.
Further Inclusion Theorems
An inclusion theorem is a theorem that specifies a set which contains at least one
eigenvalue of a given matlix. Thus. Theorems I and 2 are inclusion theorems; they even
include the whole spectrum. We now discuss some famous theorems that yield further
inclusions of eigenvalues. We state the first two of them without proofs (which would
exceed the level of this book).
SEC. 20.7
869
Inclusion of Matrix Eigenvalues
THEOREM 4
Schur's Theorem 3
Let A = [ajk] be an
11
X n matrix. Then for each of its eigenvalues A1> •.• , An'
n
(4)
n
n
\AmJ 2 ~ L \Ai\2 ~ L L
i=l
In (4) the second equality sign holds
\ajk\2
(Schur's inequality).
j=l k=l
if alld ollly if A is such that
(5)
Matrices that satisfy (5) are called normal matrices. It is not difficult to see that Hermitian,
skew-Hermitian, and unitary matrices are normal, and so are real symmetric, skew-symmetric,
and orthogonal matlices.
E X AMP L E 1
Bounds for Eigenvalues Obtained from Schur's Inequality
For the matrix
we obtain from Schur's inequality IAI ;;: v'I949 = 44.1475. You may verify that the eigenvalues are 30. 25,
•
and 20. Thus 302 + 25 2 + 202 = 1925 < 1949; in fact, A is not normal.
The preceding theorems are valid for every real or complex square matrix. Other theorems
hold for special classes of matrices only. Famous is the following.
THEOREM 5
Perron's Theorem4
Let A be a real n X n matrix whose e1l1ries are all positive. Then A has a positive
real eigenvalue A = p of multiplicity I. The corresponding eigenvector can be chosen
with all components positive. (The other eigenvalues are less than p in absolute
value.)
For a proof see Ref. [B3], vol. II, pp. 53-62. The theorem also holds for matrices with
nonnegative real entries ("Perron-Frobenius Theorem,,4) provided A is irreducible,
that is, it cannot be brought to the following form by interchanging rows and columns:
here Band F are square and 0 is a zero matrix.
:ISSAI SCHUR (1875-1941), German mathematician, also known by his important work in group theory.
OSKAR PERRON (1880-1975), GEORG FROBENIUS (1849-1917), LOTHAR COLLATZ (1910-1990),
German mathematicians, known for their work in potential theory, ODEs (Sec. 5.4) and group theory, and
numerics. respectively.
870
CHAP. 20
Numeric Linear Algebra
Perron's theorem has various applications, for instance, in economics. It is interesting
that one can obtain from it a theorem that gives a numeric algorithm:
Collatz Inclusion Theorem4
THEOREM 6
Let A = [ajk] be a real n X 11 matrLl: whose elements are all positive. Let x be a1l\'
real vector whose components Xl' • • . • Xn are positive, alld let )'1> •••• Yn be the
components of the vector y = Ax. Then the closed imen'al 011 the real axis bounded
by the smallest and the largest of the n quotients % = )'/Xj contains at least one
eigenvalue of A.
PROOF
We have Ax
=
yor
(6)
y - Ax
= O.
The transpose AT satisfies the conditions of Theorem 5. Hence AT has a positive eigenvalue
A and, corresponding to this eigenvalue. an eigenvector u whose components Uj are all
positive. Thus ATu = Au, and by taking the transpose we obtain uTA = AUT. From this
and (6) we have
or written out
n
L
u/Yj -
AXj)
= O.
j~l
Since all the components
Uj
are positive, it follows that
Yj - A.r:j
~
O.
that is.
for at least one j.
Yj - ,\xj
~
0,
that is,
for at least one j.
(7)
and
Since A and AT have the same eigenvalues. A is an eigenvalue of A. and from (7) the
statement of the theorem follows.
•
E X AMP L E 4
Bounds for Eigenvalues from Collatz's Theorem. Iteration
For a given matrix A with positive entries we choose an x = "0 and iterate, that is, we compute
Xl = Axo· X2 = AXI • . . . • X20 = AX19' In each step, taking x = ":i and y = AXj = Xj+l we compute an
inclusion interval by Collatz's theorem. This gives (6S)
0.02
A =
[ 049
0.02
0.28
0.22]
0.20
0.22
0.20
0.40
•• ' . xl9
=
,xo ~
[I]
1
0.002 16309J
0.00108155
[
0.00216309
,Xl =
I
,x20
[0.73]
0.50
,X2 ~
0.82
=
[0.00155743 J
0.000778713
0.00155743
[0.5481]
0.3186 ,
0.5886
SEC. 20.7
871
Inclusion of Matrix Eigenvalues
and the intervals 0.5
have length
~
j
Length
0.32
A ~ 0.82, 0.3186/0.50
= 0.6372 ~
A ~ 0.548l/0.73
= 0.750822, etc. These intervals
2
3
10
15
20
0.113622
0.0539835
0.0004217
0.0000132
0.0000004
Using the characteristic polynomial. you may verify that the eigenvalues of A are 0.72. 0.36. 0.09. so that those
intervals include the largest eigenvalue. 0.72. Their lengths decreased withj. so that the iteration was worthwhile.
•
The reason will appear in the next ~ection, where we discuss an iteration method for eigenvalues.
PROBLEM SET 2<h.7
GERSCHGORIN DISKS
11-61
Find and sketch disks or interval~ that contain the
eigenvalues. If you have a CAS, find the spectrum and
compare.
1.[-~
2.
-: ]
[1~_2
10- 2
10- 2
10- 2
9. If a symmetric 11 X 11 matrix A = [ajkl has been
diagonalized except for small off-diagonal entries of
size 10- 6 , what can you say about the eigenvalues?
10. (Extended Gerschgorin theorem) Prove Theorem 2.
11. Prove Theorem 3.
2
10- ]
10-2
8
8. By what integer factor can you at most reduce the
Gerschgorin circle with center 3 in Prob. 6?
12. (Normal matrices) Show that Hermitian, skewHermitian, and unitary matrices (hence real symmetric.
skew-symmetric, and orthogonal matrices) are normal.
Why is this of practical interest?
peA»~ Show that p(A) cannot be
greater than the row sum norm of A.
13. (Spectral radius
9
14. (Eigenvalues on the circle) Illustrate with a 2
X
2
matrix that an eigenvalue may very well lie on a
Gerschgorin circle (so that Gerschgorin disks can
generally not be replaced with smaller disks without
losing the inclusion property).
1
4.
+
0.3
i
0.3i
-3
+ 2i
115--171
0.5i]
0.1
[
5.
[
O.li
0.2
4i
0.1i
O.li
o
o
-1
+
10
6.
[
i
4-i
0.1
0.1
6
-0.2
o
SCHUR'S INEQUALITY
Use (4) to obtain an upper bound for the spectral radius:
15. In Prob. I
16. In Prob. 6
17. In Prob. 3
118-191
COLLATZ'S THEOREM
Apply Theorem 6, choosing the given vectors as vectors x.
10
I
18.
9
[
3
2
7. (Similarity) Find T-TAT such that in Prob. 2 the
radius of the Gerschgorin circle with center 5 is reduced
by a factor III 00.
'9. [:
4
2
J[J[J[J
CHAP. 20
872
Numeric Linear Algebra
(b) Apply the program to symmetric matrices of your
choice. Explore how convergence depends on the
choice of initial vectors. Can you construct cases in
which the lengths of the inclusion intervals are not
monotone decreasing? Can you explain the reason?
Can you experiment on the effect of rounding?
20. CAS EXPERIMENT. Collatz Iteration. (a) Write
a program for the iteration in Example 4 (with any
A and xo) that at each step prints the midpoint
(why?). the endpoints. and the length of the inclusion
interval.
20.8
Power Method for Eigenvalues
A simple standard procedure for computing approximate values of the eigenvalues of an
n X n matrix A = [ajd is the power method. In this method we start from any vector
xo (* 0) with n components and compute successively
Xs
=
AX.<_l·
For simplifying notation, we denote Xs-l by x and Xs by y. so that y = Ax.
The method applies to any n X n matrix A that has a dominant eigenvalue (a A such
that
is greater than the absolute values of the other eigenvalues). If A is symmetric. it
also gives the error bound (2), in addition to the approximation (1).
IAI
THEOREM 1
Power Method, Error Bounds
Let A be an n X n real symmetric matrix. Let x (* 0) be any real vector with n
components. Furthermore. let
y
= Ax,
mo
=
1111
XTX,
=
xTy,
Then the quotient
(I)
(Rayleigh 5 quotient)
q=
is lin lIpproximation for an eigenvalue A of A (usually that which is greatest in
absolute value, but no general statements are possible).
FlIrthe17llore.
if we set q =
A-E.
so that
E
is the error of q. then
(2)
5 LORD RAYLEIGH (JOHN WILLIAM STRUTT) (1842-1919), great English physicist and mathematician.
professor at Cambridge and London, known for his important contributions to various branches of applied
mathematics and theoretical physics. in particular. the theory of waves. elasticity. and hydrodynamics. In 1904
he received a Nobel Prize in physics.
SEC. 20.8
873
Power Method for Eigenvalues
PROOF
fJ2
denotes the radicand in (2), Since
1111
= qmo by ( I). we have
Since A is real symmetric, it has an orthogonal set of n real unit eigenvectors ZI' . . . , zn
corresponding to the eigenvalues AI, . . . , An' respectively (some of which may be equal).
(Proof in Ref. [B3], vol. I, pp. 270-272, listed in App. I.) Then x has a representation of
the fmm
Now
AZI
=
AIZI,
and. since the
Zj
etc., and we obtain
are orthogonal unit vectors.
(4)
It follows that in (3),
Since the
Zj
are orthogonal unit vectors, we thus obtain from (3)
Now let Ae be an eigenvalue of A to which q is closest, where c suggests ·'closest"". Then
(Ae - q)2 ~ (AJ - q)2 for j = 1, ... , n. From this and (5) we obtain the inequality
Dividing by
1110,
taking square roots. and recalling the meaning of
This shows that fJ is a bound for the error
A and completes the proof.
E
fJ2
gives
of the approximation q of an eigenvalue of
•
The main advantage of the method is its simplicity. And it can handle sparse matrices
too large to store as a full square array. Its disadvantage is its possibly slow convergence.
From the proof of Theorem ] we see that the speed of convergence depends on the ratio
of the dominant eigenvalue to the next in absolute value (2:] in Example I, below).
If we want a convergent sequence of eigenvectors, then at the beginning of each step
we scale the vector, say, by dividing its components by an absolutely largest one, as in
Example 1, as follows.
874
E X AMP L E 1
CHAP. 20
Numeric Linear Algebra
Application of Theorem 1. Scaling
For the symmetric matrix A in Example 4, Sec. 20.7. and Xo
indicated scaling
A
=
O.4Y
0.02
om
0.28
0.22
0.20
[I]
0.22]
[
~::~ , Xo
=
XlO
=
~
~.504682
[0.890244]
,
Xl
[0.931193]
~.609756 , X2
=
0,999707]
0,990663]
X5 =
I] T we obtain from (I) and (2) and the
= [I
.
~.500146
0,999991]
.
~.500005
Xl5 =
.
[
[
[
~.541284
=
Here AXV = [0.73 0.5 0.S2IT, scaled to Xl = [0.73/0.82 0.5/0.82 lIT. etc. The dominam eigenvalue is
0.72, an eigenvector [I 0.5 lIT. The corresponding q and 8 are computed each time before the next scaling.
Thus in the first step.
XOTAxO
2.05
- - - T - = -3- = 0.683333
Xo Xo
,, __ (1112 _ q2)1/2 __
v
1110
(AXo)TAxo
2)112
T
- q
= (1.4553
- - - q 2)112 = 0.134743.
Xo Xo
3
This gives the foIlowing values of q, 8, and the error
j
€
=
0.72 - q (calculations with 10D, rounded to 6D):
2
5
10
q
0.683333
0.716048
0.719944
0.720000
8
0.134743
0.036667
0.038887
0.003952
0.004499
0.000056
0.000l41
5' 10-8
€
The error bounds are much larger than the actual errors. This is typicaL although the bounds cannot be improved:
that is, for special symmetric matrices [hey agree with the errors.
Our present results are somewhat better than those of CoIlatz'~ method in Example 4 of Sec. 20.7, at the
expense of more operations.
•
Spectral shift, the transition from A to A - kI, shifts every eigenvalue by -k. Although
finding a good k can hardly be made automatic, it may be helped by some other method
or small preliminary computational experiments. In Example I. Gerschgorin' s theorem
gives -0.02 ~ A ~ 0.82 for the whole spectrum (verify!). Shifting by -0.4 might be too
much (then -0.42 ~ A ~ 0.42), so let us try -0.2.
E X AMP L E 2
Power Method with Spectral Shift
For A - 0.21 with A as in Example I we obtain the following substantial improvements (where the index 1
refers to Example I and the index 2 to the present example).
j
81
82
€l
€2
0.134743
0.134743
0.036667
0.036667
2
5
10
0.038887
0.034474
0.003952
0.002477
0.004499
0.000693
0.000056
1.3. 10-6
0.000141
1.8· 10-6
5 '10- 8
9' 10- 12
•
SEC. 20.9
875
Tridiagonalization and QR-Factorization
PRO B. L EMS E1::::..2 D-;-8_
11-71
°
POWER METHOD WITH SCALING
Apply the power method (3 steps) with scaling. using
= [I
I]T or [I I I]T or [I I I I]T. as
applicable. Give Rayleigh quotients and error bounds.
Show the details of your work.
Xo
3.5
1. [
3.
0.6
2.0J
2. [
U.5
2.0
11. (Spectral shift, smallest eigenvalue) In Prob. 5 set
B = A - 31 (as perhaps suggested by the diagonal
entries) and try whether you may get a sequence of q's
converging to an eigenvalue of A that is smallesT (not
largest) in absolute value. Use Xv = [I I nT. Do
8 step". Verify that A has the spectrum (0.3.51.
12. CAS EXPERIMENT. Power Method with Scaling.
Shifting. (a) Write a program for 11 X 11 matrices that
prints every step. Apply it to the (nonsymmetric!)
matrix (20 steps), starting from [1 I If.
0.8J
-0.6
0.8
[~ ~J
.{; :J
5'[-~ -~ ~]
I
2
9. Prove that if x is an eigenvector, then 8 =
in (2).
Give two examples.
10. (Rayleigh quotient) Why does q generally
approximate the eigenvalue of greatest absolute value?
When will q be a good approximation?
A
3
=
°
4
4
-I
°
2
5
8
3
7.
6.
0
2
3
2
o
8
2
-2
°
° °
°
3
o
5
8. (Optimality of 8) In Prob. 2 choose Xo = [3 _l]T
and show that q = 0 and 8 = I for all steps and that the
eigenvalues are ::'::: 1. so that the interval [q - 8, q + 8]
cannot be shortened in general! Experiment with
other xo.
20.9
[
::
-19
~: I:] .
- 36
-7
(b) Experiment in (a) with shifting. Which shift do you
find optimal?
(c) Write a program as in (a) but for symmetric
matrices that prints vectors. scaled vectors, q, and 8.
Apply it to the matrix in Prob. 6.
(d) Find a (nonsymmetric) matrix for which 8 in (2)
is no longer an error bound.
(e) Experiment systematically with speed of
convergence by choosing matrices with the second
greatest eigenvalue (i) almost equal to the greatest, (ii)
somewhat different, (iii) much different.
Tridiagonalization and QR-Factorization
We consider the problem of computing all the eigenvalues of a real symmetric matrix
A = [ajk]' discussing a method widely used in practice. In the first stage we reduce [he
given matrix stepwise to a tridiagonal matrix, that is, a matrix having all its nonzero
entries on the main diagonal and in the positions immediately adjacent to the main diagonal
(such as A3 in Fig. 447, Third Step). This reduction was invented by A. S. Householder
(1. Assn. Comput. Machinery 5 (1958), 335-342). See also Ref. [E29] in App. I.
This Householder tridiagonalization will simplify the matrix without changing its
eigenvalues. The latter will then be determined (approximately) by factoring the
tridiagonalized matrix, as discussed later in this section.
876
CHAP. 20
Numeric Linear Algebra
Householder's Tridiagonalization Method
An 11 X 11 real symmetric matrix A = [ajk] being given, we reduce it by Il - 2 successive
similarity transformations (see Sec. 20.6) involving matrices PI' ... , P n-2 to tridiagonal
form. These matrices are orthogonal and symmetric. Thus Pi l = PIT = PI and similarly
for the others. These transformations produce from the given Ao = A = [lIjk] the matrices
Al = raJ};]. A2 = [aj~)], .... A n - 2 = [a3k- 2 )] in the form
Al = PIAoPl
A2
(1)
=
P 2A I P 2
The transformations (1) create the necessary zeros, in the first step in Row I and Column
1, in the second step in Row 2 and Column 2, etc., as Fig. 447 illustrates for a 5 X 5
matrix. B is tridiagonal.
How do we determine PI' P 2 , . . . Pn - 2 ? Now, all these P r are of the form
where I is the
0; thus
(3)
= I •... , 11
(r
(2)
11
X 11
-
2)
unit matrix and Vr = [Vjr] is a unit vector with its first r components
o
o
o
*
o
o
*
*
V n -2
=
*
*
*
*
where the asterisks denote the other components (which will be nonzero in general).
Step 1. VI has the components
Vn
= 0
(a)
(4)
j = 3. 4 .... ,
(b)
11
where
(c)
Sl
=
\/a212 + a3I 2 + ... + a nI 2
where SI > 0, and sgn a21 = + 1 if a21 ~ 0 and sgn lI21 = -1 if a21
compute PI by (2) and then Al by (I). This wa<; the first step.
< O. With this we
SEC 20.9
Tridiagonalization and QR-Factorization
877
[.~~; .. *"'J
,.
o.!_
[:~:::** _.....
[::::* * :::*.:=;::J
..~.!'*:
*
*.
1
r:
:;: -
First Step
Second Step
Third Step
At =PtAP t
A" = P,At P 2
~ =P3 A"P j
Fig. 447. Householder's method for a 5 X 5 matrix.
Positions left blank are zeros created by the method.
Step 2. We compute v 2 by (4) with all subscripts increased by 1 and the
aj~), the entries of Al just computed. Thus [see also (3)]
ajk
replaced by
-I ( 1 +la~~1
- -)
2
S2
(4*)
j = 4,5, ... , n
where
+
a~~
2
With this we compute P2 by (2) and then A2 by (1).
Step 3. We compute V3 by (4*) with all subscripts increased by 1 and the ajil replaced
by the entries aj'fl of A2 , and so on.
E X AMP L E 1
Householder Tridiagonalization
Tridiagonalize the real symmetric matrix
4
6
5
2
Solution. Step
sgn
G21
=
2
1. We compute Sl2 = 4 + ]2 + ]2 = l!l from (4c). Since
(4) by straightforward computation
Q21 =
+ 1 in (4bl and get tram
0 1 [00.119573 161.
[
0.985 598 56
V21
VI =
V31
V41
=
0.119 573 16
From this and (2).
o
o
-~.235
-0.942 809 04
-0.235 702 27
-0.235 702 27
0.97140452
-0.G28 595 48
-0.235 702 27
-0.028 595 48
0.97140452
70227J
4 > 0, we have
878
CHAP. 20
Numeric Linear Algebra
From the first line in (I) we now get
-Vt8
Step 2. From (4*) we compute
S22
o
7
-I
-(
-I
9/2
312
-I
3/2
912
01
= 2 and
[:~]
V2 =
=
[:.923879 ,,].
0.38268343
V42
From this and (2).
P,
~ [~
0
-l~
0
0
0
0
-Itv'2
0
-1V2
]
Itv'2
The ",cond line in (I) now gives
[,
-Vt8
0
7
,/2
Y2
6
0
0
-v'18
B = A2 = P2 A I P2 =
0
0
~l
This matrix B is tridiagonal. Since our given matrix has order 11 = 4, we needed 11 - 2 = 2 steps to accomplish
this reduction. as claimed. (Do yOll see that we got more zeros than we can expect in general?)
B is similar to A, as we now show in general. This is essential because B thus has the same spectrum as A,
by Theorem 2 in Sec. 20.b.
•
We assert that Bin (1) is similar to A = Ao. The matrix PT is symmetric;
B Similar to A.
indeed,
T
TT
T
TT
T
Pr = (I - 2v,.vr) = 1 - 2(v rvr ) = [ - 2v vr = Pro
T
Also, PT is orthogonal because
T
P."Pr
=
Pr
2
=
Vr
is a unit vector. so that
T2
(I - 2vrvr )
T
V,T V T
=
1 and thus
T
T
= 1 - 4v rvr + 4v.rv r V."Vr
T
T
T
= 1 - 4v,.vr + 4vAvr vr)Vr = I.
Hence p;l
= pr T =
Pr and from (1) we now obtain
B
= P11-2A11-3Pn-2 = ...
. . . = Pn - 2 Pn - 3
-1
•••
-}
= Pn - 2 P n - 3
P I AP1
...
-1
•.•
PI API"
Pn - 3 Pn - 2
. Pn - 3 Pn - 2
1
= P- AP
where P
=
P IP 2 ... P n-2' This proves our assertion.
•
SEC. 20.9
879
Tridiagonalization and QR-Factorization
QR-Factorization Method
In 1958 H. Rutishauser of Switzerland proposed the idea of using the LU-factorization
(Sec. 20.2; he called it LR-factorization) in solving eigenvalue problems. An improved
version of Rutishauser's method (avoiding breakdown if certain submatrices become
singular, etc.; see Ref. [E291) is the QR-method, independently proposed by the American
I. G. F. Francis (Computer 1. 4 (1961-62), 265-271, 332-345) and the Russian
V. N. Kublanovskaya (Zhllmal v.ych. Mal. i Mat. Fi::.. 1 (1961),555-570). The QR-method
uses the factorization QR with orthogonal Q and upper triangular R. We discuss the
QR-method for a real symmetric matrix. (For extensions to general matrices see Ref. rE29]
in App. 1.)
In this method we first transform a given real symmetric f1 X 11 matrix A into a
tridiagonal matrix Bo = B by Householder's method. This creates many Leros and thus
reduces the amount of further work. Then we compute B 1 , B 2 , . . . stepwise according to
the following iteration method.
Step 1. Factor Bo = QoRo with orthogonal Ro and upper triangular
Bl = RoQo·
Ro.
Then compute
Step 2. Factor Bl = QIRI' Then compute B2 = R 1 Ql'
Ge1leral Step s
+ 1.
(5)
Here Qs is orthogonal and Rs upper triangular. The factorization (Sa) will be explained
below.
B 8 + 1 Similar to B. Convergence to a Diagonal Matrix. From (5a) we have Rs = Q;lBs'
Substitution into (5b) gives
(6)
Thus Bs+l is similar to Bs' Hence Bs+l is similar to Bo = B for all s. By Theorem 2,
Sec. 20.6, this implies that Bs+l has the same eigenvalues as B.
Also, Bs+l is symmetric. This follows by induction. Indeed. Bo = B is symmetric.
Assuming Bs to be symmetric, that is, BsT = Bs, and using Q;l = Qs T (since Qs is
orthogonal), we get from (6) the symmetry,
If the eigenvalues of B are different In absolute value, say,
then
lim
s_oc
Bs =
IAll > IA21 > ... > IAnl,
D
where D is diagonal, with main diagonal entries
listed in App. I.)
"-]0
"-2, ... ,
An. (Proof in Ref. [E29J
880
CHAP. 20
Numeric Linear Algebra
How to Get the QR-Factorization, say, B = Bo = [bjk ] = QoRo. The tridiagonal
matrix B has 11 - I generally nonzero entries below the main diagonal. These are
b2l • b32 , . . . • bn,n-l' We multiply B from the left by a matrix C 2 such that C 2 B = [b)~]
has bW = O. We multiply this by a matrix C3 such that C3 C 2 B = [bj'r] has b~~ = O. etc.
After 11 - I such multiplications we are left with an upper triangular matrix Ro. namely.
(7)
These
11
X
f1
matrices Cj are very simple. Cj has the 2
X
2 submatrix
(OJ suitable)
in Rows j - 1 and j and Columns j - 1 and j; everywhere else on the main diagonal the
matrix Cj has entries 1; and all its other entries are O. (This submatrix is the matrix of a
plane rotation through the angle OJ; see Team Project 28. Sec. 7.2.) For instance. if
fl = 4. writing Cj = cos OJ. sJ = sin OJ. we have
C2
=
~
- 'C,
2
S2
0
C2
0
0
0
0
0
0
~l4 ~ ~
0
0
Cg
S3
-.1'3
C3
0
0
o
0
o
These Cj are orthogonal. Hence their product in (7) is orthogonal, and so is the inverse
of this product. We call this inverse Qo. Then from (7),
(8)
where, with Ci 1
= C/o
(9)
This is our QR-factorization of Bo. From it we have by (5b) with s = 0
(10)
We do not need Qo explicitly, but to get Bl from (10). we first compute R OC 2 T. then
(ROC 2 T)C3 T, etc. Similarly in the further steps that produce B2 • B 3 , • • • •
Determination of cos OJ and sin OJ. We finally show how to find the angles of rotation.
cos f)2 and sin ~ in C 2 must be such that b~) = 0 in the product
o
o
SEC 20.9
881
Tridiagonalization and QR-Factorization
Now b~21 is obtained by multiplying the second row of C 2 by the first column of B,
cos 82 =
v'l
(II)
E X AMP L E 2
VI]
82
tan 82
sin 82 =
Similarly for 83 , 84 ,
+ tan 2
V]
+
+
(b 21 /b n
)2
b21lb n
V]
tan 2 82
+
(b 21 /b n
)2
The next example illustrates all this.
.. '.
QR-Factorization Method
Compute all the eigenvalues of the matrix
4
6
5
Solution.
We fust reduce A to tridiagonal foml. Applying Householder's method. we obtain (see Example I)
[
A2 =
-~!J8 -~T8
o
,\;'2
o
v2
6
o
0
o
From the characteristic determinant we see that A 2 • hence A, has the eigenvalue 3. (Can you see this directly
from A2?1 Hence it suffices to apply the QR-method to the tridiagonal 3 X 3 matrix
'.
~ ~ -~Is
•
-,\;18
7
[
v2
Step I. We multiply B from the lefl by
C2 =
[COO ~
sin
f)2
-s~
cos
f)2
f)2
]
0
Here (-sin O2 )' 6 + (cos 02)(-~)
With these values we compute
and lhen C 2 B by
= 0 gives
7,34846<) 23
C2 B
=
[
C,
~
0
[:
cos
1.154 700 54
o
1.414213 56
6.00000000
(cos
7.348469 23
= C g C2 B =
[
f)3) •
f)3
= -0.57735027.
.
1.414 213 56 = 0 the values cos
-7.50555350
f)3]'
-0.816 496 58]
3.26598632
.!.
cos
f)3
(II) cos O2 = 0.81649658 and sin f)2
-7.50555350
Si:
f)3
- ,in
0
In C 3 we get from (- sin f)3) • 3.265 <)86 32
and sin f)3 = 0.39735971. This gives
Ro
0]
f)3
-0.8\649658J
0
3.55902608
3.443 784 13
o
o
5.04714615
.
= 0.917 662 94
CHAP. 20
882
Numeric Linear Algebra
From this we compute
Bl
= RoC2
T
C3
T
=
10.333 333 33
- 2.054 804 67
-:.05480467
4.03508772
:.005 532 51 ]
2.00553251
4.63157895
[
which is symmetnc and tridiagonal. The off-diagonal entries in BI are still large in absolute value. Hence we
have to go on.
Step 2. We do the same computations as in the first step. with Bo = B replaced by Bl and C z and C 3 changed
accordingly, the new angles being 62 = 0.196291 533 and 63 = 0.513415589. We obtain
-2.802322 -II
[ 1053565375
Rl
~
-0.391 145 88]
0
4.08329584
3.98824028
0
0
3.06832668
1
and from this
B2 =
[ 10.8" 879 88
-0.79637918
-:.796379 18
5.44738664
~50702500
1.50702500
2.672 733 48
We see that the off-diagonal entries are somewhat smaller in absolute value than those of B I . but ,till much too
large for the diagonal entries to be good approximations of the eigenvalues of B.
Further Steps.
Ib~~)1 = Ib~i)1 in all
We list the main diagonal entries and the absolutely largest off-diagonal entry, which is
steps. You may show that the given matrix A has the spectrum I 1.6,3,2.
(J)
Stepj
b\·i)
b<.i)
HJ)
maXj*k Jbjkl
3
5
7
9
10.966892 9
10.9970872
10.999742 I
10.999977 2
5.94589856
6.00] 81541
6.00024439
6.00002267
2.08720851
2.00109738
2.00001355
2.00000017
0.58523582
0.12065334
0.035911 07
0.01068477
11
22
33
•
Looking back at our discussion, we recognize that the purpose of applying Householder's
tridiagonalization before the QR-factorization method is a substantial reduction of cost in
each QR-factorization. in particular if A is large.
Convergence acceleration and thus further reduction of cost can be achieved by a
spectral shift, that is, by taking Bs - ksI instead of Bs with a suitable ks . Possible choices
of ks are discussed in Ref. [E291, p. 510.
---.
.•.
"1-41
-......
HOUSEHOLDER TRIDIAGONALIZATION
Tridiagonalize. showing the details:
1.
[
3.5
1.0
1.0
5.0
3.0
1.5
3.0
3.5
0.98
3.
0.04
1.5J
~[
0.04
0.56
OAO
0.44
0.40
0.80
[
8
2
2
8
8
2
2
2
2
6
4
2
2
4
6
4.
0
]
15-91
0.44]
8
QR-FACTORIZATION
Do three QR-steps to find approximations of the
eigenvalues of:
5. The matrix in the answer to Prob 1
883
Chapter 20 Review Questions and Problems
6. The matrix in the answer to Prob. 3
9.
0,:]
0.2
4.1
,,
1. What are the main problem
algebra?
0.1
0.1
4.0
ro
O.~]
1.0
0.1
10. CAS EXPERIMENT. QR-Method. Try to find out
experimentally on what properties of a matrix the speed
of decrease of off-diagonal entries in the QR-method
depends. For this purpose write a program that first
tridiagonalizes and then does QR-steps. Try the
program out on the matrices in Probs. 1. 3. and 4.
Summarize your findings in a short report.
-U.I
-4.3
7.0
area~
=1 :
in numeric linear
S T ION SAN D PRO B L EMS
Xl
+
X2
+
X3
S
2. What is pivoting? When and how would you apply it?
Xl
+
2X2
+
2x3
6
3. What happens if you apply Gauss elimination to a
system that has no solutions?
Xl
+
2x2
t-
3X3
8
4. What is Doolittle's method? Its connection to Gauss
elimination?
17.
18.
5X1
5. What is Cholesk)"s method? When would you apply it?
19.
10. Why are similarity transformations of matrices important
in designing numeric methods? Give examples.
11. What is the power method for eigenvalues? What are
its advantages and disadvantages?
13. State Schur's inequality and give some applications of
it.
GAUSS ELIMINATION
116-]91
Solve:
16.
4x2 5Xl
+
6Xl -
3X2
+
7X2
+
=
11.8
X3 =
34.2
3X3
2X3
=
-3.1
-10
3X2
+
+
21:1
X2
9X3
0
=
3X3
=
IS
X3
=
-13
+ SX3 =
26
20. Solve Prob. 17 by Doolittle's method.
21. Solve Prob. 17 by Cholesky's method.
122-24 1 INVERSE MATRIX
Compute the inverse of:
22.
14. What is tridiagonalization? When would you apply it?
15. What is the idea of the QR-method? When would you
apply the method?
- SX 2 + ISx 3
3x 1 -
12. State Gerschgorin's theorem from memory. Can you
remember its proof?
17
X2
4X2 -
8. What is least squares approximation? What are the
normal equations?
9. What is an eigenvalue of a matrix? Why are eigenvalue
problems important? Give typical examples.
3x3
2Xl -
6. What do you know about the convergence of the
Gauss-Seidel method?
7. What is ill-conditioning? What is the condition number
and its significance?
+
23.
r"
2.0
05]
O.S
0.5
1.0
1.5
2.0
1.0
2.0
10]
r'
2.0
3.S
1.0
1.S
I.S
9.0
CHAP. 20
884
Numeric Linear Algebra
125-261 GAUSS-SEIDEL METHOD
Do 3 steps without scaling, starting from [1
25.
+ 15x2 -
Xl
X3
+ 3x2
lOx I
11
=
-17
5
34. In Prob. 18
35. In Prob. 19
136-381 CONDITION NUMBER
Compute the condition number (corresponding to the
Coo-vector norm) of the coefficient matrix:
36. In Prob. 22
37. In Prob. 23
38. In Prob. 24
@9-40 1
FITTING BY LEAST SQUARES
Fit:
VECTOR NORMS
127-321
Compme the f\-, C2 -, and Ceo-norms of the vectors
27. [0 4 -8 3]T
28. [3
29.
30.
31.
32.
8
[-4
[0
-ll]T
1
0
0
2]T
O]T
[-5
-2
7
[0.3
1.4
0.2
0
O]T
-0.6]T
133-351 MATRIX NORM
Compute the matrix norm corresponding to the Ccc-vector
norm for the coetlicient matrix:
33. In Prob. 17
39. A straight line to (- 2. O. \). (0. 1.9). (2. 3.8), (4, 6.\),
(6,7.8)
40. A quadmtic pambola to O. 9). (2, 5), (3. 4), (4. 5). (5. 7)
141-431 EIGENVALUES
Find three circular disks that must contain all the eigen values
of the matrix:
41. In Prob. 22
42. In Prob. 23
43. Tn Prob. 24
44. (Power method) Do 4 sleps of the power method for
the matrix in Prob. 24. starting from [I I I]T and
computing the Rayleigh quotients and error bounds.
45. (Householder and QR) Tridiagonalize the matrix in
Prob. 23. Then apply 3 QR steps. (Spectrum (6S):
9.65971, 4.07684. 0.263451)
Numeric Linear Algebra
Main tasks are the numeric solution of linear systems (Secs. 20.1-20.4), curve fitting
(Sec. 20.5). and eigenvalue problems (Secs. 20.6-20.9).
Linear systems Ax = b with A = [ajk]' written out
(1)
can be solved by a direct method (one in which the number of numeric operations
can be specified in advance. e.g., Gauss's elimination) or by an indirect or iterative
method (in which an initial approximation is improved stepwise).
885
Summary of Chapter 20
The Gauss elimination (Sec. 20.1) is direct, namely, a systematic elimination
process that reduces (1) stepwise to triangular fonn. In Step I we eliminate Xl from
equations E2 to En by subtracting (a2I/an) EI from E 2, then (a3I/an) EI from E 3.
etc. Equation EI is called the pivot equation in this step and an the pivot. In Step
2 we take the new second equation as pivot equation and eliminate X2, etc. If the
triangular fonn is reached, we get Xn from the last equation, then Xn-I from the
second last. etc. Partial pivoting (= interchange of equations) is necessary if
candidates for pivots are zero, and advisable if they are small in absolute value.
Doolittle's, Crout's, and Cholesky's methods in Sec. 20.2 are variants of the
Gauss elimination. They factor A = LV (L lower triangular, U upper triangular)
and solve Ax = LUx = b by solving Ly = b for y and then Ux = y for x.
In the Gauss-Seidel iteration (Sec. 20.3) we make an = a22 = ... = ann = I
(by division) and write Ax = (I + L + U)x = b; thus x = b - (L + U)x, which
suggests the iteration formula
(2)
XCrn + I )
=b
- Lx(m+l) - UX Crn)
in which we always take the most recent approximate x./s on the right. If Ilell < l.
where C = -(I + L)-IU, then this process converges. Here. IICII denotes any
matrix norm (Sec. 20.3).
If the condition number K(A) = IIAII IIA -111 of A is large. then the system
Ax = b is ill-conditioned (Sec. 20.4), and a small residual r = b - Ax does 1Iot
imply that x is close to the exact solution.
The fitting of a polynomial p(x) = b o + blx ., ... + bmx'" through given data
(points in the J\"}·-plane) (Xl' YI), ... , (Xm Yn) by the method of least squares is
discussed in Sec. 20.5 (and in statistics in Sec. 25.9).
Eigenvalues A (values A for which Ax = Ax has a solution x =1= 0, called an
eigenvector) can be characterized by inequalities (Sec. 20.7), e.g. in Gerschgorin's
theorem, which gives 11 circular disks which contain the whole spectrum (all
eigenvalues) of A, of centers ajj and radii Llajkl (sum over k from I to 11, k =1= j).
Approximations of eigenvalues can be obtained by iteration. starting from an
Xo =1= 0 and computing Xl = Axo, x 2 = Ax!> ... , xn = AXn-i. In this power
method (Sec. 20.8) the Rayleigh quotient
(3)
q=
gives an approximation of an eigenvalue (usually that of the greatest absolute value)
and, if A is symmetric, an error bound is
(4)
Convergence may be slow but can be improved by a speCTral shift.
For determining all the eigenvalues of a symmetric matrix A it is best to first
rridiagonalize A and then to apply the QR-method (Sec. 20.9), which is based on a
factorization A = QR with 0I1hogonai Q and upper triangular R and uses similarity
transformations.
CHAPTER
21
>~-
Numerics for ODEs and PDEs
Numeric methods for differential equation" are of great practical importance to the
engineer and physicist because practical problems often lead to differential equations that
cannot be solved by one of the methods in Chaps. 1-6 or 12 or by similar methods. Also,
sometimes an ODE does have a solution fonnula (as the ODEs in Secs. 1.3-1.5 do), which,
however, in some specific cases may become so complicated that one prefers to apply a
numeric method instead.
This chapter explains and applies basic methods for the numeric solution of ODEs (Secs.
21.1-21.3) and POEs (Secs. 21.4-21.7).
Sections 21.1 alld 21.2 may be studied immediately after Chap. 1 alld Sec. 21.3
immediately after Chap. 2, because these sections are independent of Chaps. 19 and 20.
Sections 21.4-21.7 Oil PDEs may be studied immediately after Chap. 12 if students
have some knowledge of linear systems of algebraic equations.
Prerequisite: Secs. 1.1-1.5 for ODEs, Secs. 12.1-12.3, 12.5, 12.10 for POEs.
References and Answers to Problems App. 1 Part E (see also Parts A and C), App. 2.
21.1
Methods for First-Order ODEs
From Chap. 1 we know that an ODE of the first order is of the form F(x, y. y') = 0 and
can often be written in the explicit form y' = f(x. y). An initial value problem for this
equation is of the fonn
y' = f(x. yt
(1)
)'(xo) = Yo
where .1.'0 and Yo are given and we assume that the problem has a unique solution on some
open interval (/ < x < b containing .1.'0'
In thi~ section we shall discu~s methods of computing approximate numeric values of
the solution y(x) of (1) at the equidistant points on the x-axis
Xl
=
.1.'0
+
h.
where the step size h is a fixed number, for instance, 0.2 or O. I or 0.01. whose choice we
discuss later in this section. Those methods are step-by-step methods, using the same
formula in each step. Such formulas are suggested by the Taylor series
2
(2)
886
y(x
+ h)
= y(x)
+
hy' (x)
+
h
""2
y"(X) + ....
SEC 21.1
887
Methods for First-Order ODEs
For a small h the higher powers h 2 , h 3 ,
approximation
y(x
+ h)
.•.
= y(x)
= y(x)
are very small. This suggests the crude
+ hy' (x)
+ hf(.\", y)
(with the second line obtained from the given ODE) and the following iteration process.
In the first step we compute
+
Yl
=
Y(.\"o
+
h). In the second step we compute
which approximates )'(X2) = 1'(xo
+
2h), etc., and in general
which approximates Y(XI)
=
Yo
hf(xo, Yo)
(3)
(n
= 0, 1, .. ').
This is called the Euler method or the Euler-Cauchy method. Geometrically it is an
approximation of the curve of y(x) by a polygon whose first side is tangent to this curve
at xo (see Fig. 448).
y
Fig. 448.
Euler method
This crude method is hardly ever u~ed in practice, but since it is simple, it nicely explains
the principle of methods based on the Taylor series.
Taylor's formula with remainder has the foml
y(X
+ h) = y(x) + hy' (x) + ~h2y"W
(where x ~ l; ~ x + h). It shows that in the Euler method the tru/lcation error il1 each
step or local truncation error is propOitional to h 2 , written 0(11 2 ), where 0 suggests order
(see also Sec. 20.1). Now over a fixed x-interval in which we want to solve an ODE the
number of steps is proportional to 1111. Hence the total error or global error is propOitional
2
to h 0/h) = hI. For this reason. the Euler method is called a first-order method. In
addition, there are roundoff errors in this and other methods, which may affect the
accuracy of the values Yb .\'2, ..• more and more as 11 increases, as we shall see.
888
CHAP. 21
Table 21.1
Numerics for ODEs and PDEs
Euler Method Applied to (4) in Example 1 and Error
+
xn
)'n
0
I
0.0
0.000
0.000
0.2
0.4
0.6
0.8
1.0
0.000
0.040
0.128
0.274
0.489
0.040
2
3
4
5
E X AMP L E 1
0.2(xn
11
Exact
Values
)'n)
0.088
0.146
0.215
Error
En
0.000
0.021
0.000
0.092
0.222
0.426
n.7U<
0.052
0.094
0.152
0.229
O.02J
Euler Method
Apply the Euler method to the following initial value problem. choosing II = 0.2 and computing )'1 • . . .
y' =
(4)
Solution.
Here f(x.
y)
= x
+ y; hence
X
+ y,
yeO) = o.
f(x n , Yn) = xn
Yn+1
=
)"n
, \'5:
+ Yn' and we see that (3)
+ 0.2lxn
-j-
become~
Yn)'
Table 21.1 shows the computations. the values of the exact solution
y(x) = eX -
x-I
obtained from (4) in Sec. 1.5. and the error. Tn practice the exact solution is unknown, but an indication of the
accuracy of the values can be obtained by applying the Euler method once more with step 211 = 0.4. letting )'n*
denote the approximation now obtained. and comparing corresponding approximations. This computation is:
xn
vn *
0.0
0.4
0.8
0.000
0.000
0.160
O.4(Xn
+ )'n)
0.000
0.160
Yn in Table 21.1
Difference Yn - )'n*
0.000
0.040
0.274
0.000
0.040
0.1l4
Let En and En" be the errors of the computations with II and 211. respectively. Since the error is of order 112.
in a switch from II to 217 it is multiplied by 22 = 4, but since we need only half as many steps as before, it
will be multiplied only by 412 = 2. Hence En" = 2En SO that the difference is En * - En = 2En - En = En'
Now Y = )'n + En = Yn * + En * by the definition of error; hence En" - En = )'n - )'n* indicates En
qualitatively. Tn our computations, Y2 - )'2* = 0.04 - 0 = 0.04 (actual error 0.052. see Table 21.1) and
)"4 - )'4* = 0.274 - 0.160 = 0.114 (actually 0.152).
•
E X AMP L E 2
Euler Method for a Nonlinear ODE
Figure 449 concerns the initial value problem
(5)
yeO) = OA
and shows the curve of the solution y = 1/[2.5 - Sex)] + 0.01x 2 where Sex) is the Fresnel integral (38) in
App. 3.1. It also shows 80 approximate values for 0 ~ x ~ 4 obtained by the Euler method from (3).
Although Ii = 0.05 is smaller than Ii in Example 1, the accuracy is still not good. It is interesting that the error
is not monotone increasing. obviously since the solution is not monOlone. We shall return to this ODE in the
problem set.
•
SEC. 21.1
889
Methods for First-Order ODEs
y
0.70
0.60
0.50
0.40
o
Fig. 449.
2
3
4
x
Solution curve and Euler approximation in Example 2
Automatic Variable Step Size Selection in Modern Numeric Software
The idea of adaptive integration as motivated and explained in Sec. 19.5 applies equally
well to the numeric solution of ODEs. It now concerns automatically changing the step
size h depending on the variability of y' = f determined by
(6*)
Accordingly, modern software automatically selects variable step sizes h n so that the
error of the solution will not exceed a given maximum size TOL (suggesting tolerance).
Now for the Euler method. when the step size is h = h m the local error at Xn is about
~hn21y"(xn)l. We require that this be equal to a given tolerance TOL,
(6)
(a)
21"(Xn) 1=
I
2hn
)'
2TOL
thus
TOL,
Iy''lxn) 1
y"(X) must not be zero on the interval J: Xo ~ x = xN on which the solution is wanted.
Let K be the minimum of 1y" (X) 1on J and assume that K > O. Minimum 1,/'(x) 1corresponds
to maximum h = H = Y2 TOLIK by (6). Thus. Y2 TOL = HVK. We can insert this
into (6b). obtaining by straightforward algebra
K
(7)
where
For other methods, automatic step size selection is based on the same principle.
Improved Euler Method
By taking more terms in (2) into account we obtain numeric methods of higher order and
precision. But there is a practical problem. If we substitute y' = f(x, y(x)) into (2), we
have
(2*)
890
CHAP. 21
Numerics for ODEs and PDEs
Now y in .f depends on x, so that we have f' as shown in (6*) and .f", .fill even much
more cumbersome. The general strategy now is to avoid the computation of these
derivatives and to replace it by computing .f for one or several suitably chosen auxiliary
values of (x. y). "Suitably" means that these values are chosen to make the order of
the method as high as possible (to have high accuracy). Let us discuss two such methods
that are of practical imp0l1ance. namely. the improved Euler method and the (classical)
Runge-Kutta method.
In the improved Euler method or improved Euler-Cauchy method (sometimes also
called Heun method), in each step we compute first the auxiliary value
(8a)
and then the new value
(8b)
Thi~ method ha" a simple geometric interpretation. In fact. we may say that in the
we approximate the solution y by the straight line through
interval from x" to Xn +
(xn , Yn) with slope f(x n , Yn), and then we continue along the straight line with slope
f(X n +l, Y:;+I) until x reaches X,,+I'
The improved Euler-Cauchy method is a predictor-corrector method, because in each
step we first predict a value by (8a) and then correct it by (Sb).
In algorithmic form. using the notations kl = hf(xn, Yn) in (Sa) and k2 = hf(.'n+b Y~+I)
in (8b), we can write this method as shown in Table 21.2.
!h
Table 21.2
Improved Euler Method (Heun's Method)
ALGORITHM EULER (f, xo. Yo, h. N)
This algorithm compute<; the solution of the initial value problem y' = f(x. y) . .\"(xo) = Yo
at equidistant points xl = Xo + 17, X2 = Xo + 211, ... , XN = Xo + Nfl; here f is such
that this problem has a unique solution on the mterval [xo. xNl (see Sec. 1.7).
INPUT:
Initial values xo, Yo, step size II, number of steps N
OUTPUT: Approximation Yn+l to the solution Y(Xn +l) at
where 11 = 0, . . . , N - 1
For 11 =
o.
I. .... N - 1 do:
Xn+l = Xn
+ 11
kl = hf(xn, y,,)
k2
= hJ(Xn+b
Y,,+l =
y"
Yn
+ k1 )
+ 2(k1 + k2 )
OUTPUT Xn+l, ."n+l
End
Stop
End EULER
xn+l
=
Xo
+
(n
+
l)h.
SEC 21.1
891
Methods for First-Order ODEs
E X AMP L E 3
Improved Euler Method
Apply the improved Euler method to the initial vallie problem (4). choosmg h
Solution.
=
0.2. as before.
For the present problem we have in Table 21.2
k2
.1',,+1 = Yn +
=
0.2
""2
0.2(x" + 0.2 + "n
+ 0.2(xn + .1',,»
(2.2xn + 2.2Yn + 0.2) = Yn + 0.22(xn + .1',,) + 0.02.
Table 2 1.3 show~ thai ollr present results are more accurate than tho~e in Example I: see also Table 21.6. •
Table 21.3
Improved Euler Method Applied to (4) and Error
+ .1'n)
+ 0.02
U.22(xn
n
xn
•vn
0
0.0
0.2
0.4
0.6
0.8
1.0
0.0000
0.0200
0.0884
0.2158
0.4153
0.7027
2
3
4
5
Exact Values
(4D)
Error
0.0000
0.0214
0.0918
0.2221
0.4255
0.7183
0.0000
0.0014
0.0034
0.0063
0.0102
0.0156
0.0200
0.0684
0.1274
0.1995
0.2874
Error of the Improved Euler Method. The local error is of order h 3 and the globed
error of order h 2 , so that the method is a second-order method.
PROOF
Setting
In = Ierm .1'(xn » and using (r), we have
(9a)
Y(Xn
+
h) - .r(xn )
=
-
hI,.
1 2-'
1 3-"
+ :z./I
In + 6 h In + ....
Approximating the expression in the brackets in (8b) by
Taylor expansion, we obtain from (8b)
1 [)'n+l - )'n ="2h
In
_
(where'
1
[-
-"2 h In
(9b)
= dldx."
+
]
In+l
+
(fn
-
+
-,
1
1"
+ 1"+1
2 -If
hIn +"2h In
and again using the
+ ... )]
etc.). Subtraction of (9b) from (9a) gives the local error
h
3
_If
h
3
_If
11 3
-If
-6 .f n - -4 I n + ... -- - -12 .f" + ....
Since the number of steps over a fixed x-interval is proportional to I11z, the global error
3
2
•
is of order h /11 = h , so that the method is of second order.
892
CHAP. 21
Numerics for ODEs and PDEs
Runge-Kutta Methods (RK Methods)
A method of great practical importance and much greater accuracy than that of the
improved Euler method is the classical Runge-Kutta method of fourth order, which we
call briefly the Runge-Kutta method. 1 It is shown in Table 21.4. We see that in each
step we first compute four auxiliary quantities k 1, k2 , k 3 , k4 and then the new value Yn+1'
The method is well suited to the computer because it needs no special starting procedure,
makes light demand on storage, and repeatedly uses the same straightforward
computational procedure. It is numerically stable.
Note that if f depends only on x, this method reduces to Simpson's rule of integration
(Sec. 19.5). Note further that kJ, ... , k4 depend on 11 and generally change from step to
step.
Table 21.4
Classical Runge-Kutta Method of Fourth Order
ALGORITHM RUNGE-KUTTA (f, x o, )'0' h, N).
This algorithm computes the solution of the initial value problem y'
at equidistant points
Xl = Xo
+
h. X2
=
Xo
+
2h . . . . , XN = Xo
= f(x, y), y(Xo) = Yo
+ Nil:
here f is such that this problem has a unique solution on the interval [xo, XN] (see Sec. 1.7).
INPUT:
Function f, initial values xo, Yo, step size h, number of steps N
OUTPUT: Approximation Yn+1 to the solution 'y(Xn +1) at X,,+1
where n = 0, I, ... , N - 1
For n
=
= Xo
+
(n
+
l)h.
0, 1, ... , N - 1 do:
k1 = hf(xn, y,,)
+ ~h, Yn + ~kl)
hf(xn + ~h, Yn + ~k2)
hf(xn + h, Yn + k3)
= Xn + h
= Yn + ~(k1 + 2k2 + 2k3 +
k2 = hf(xn
k3 =
k4 =
Xn+1
Yn+1
k 4)
OUTPUT Xn +], )'1/+1
End
Stop
End RUNGE-KUTTA
INamed after the German mathematicians KARL RUNGE (Sec. 19.4) and WILHELM KUTTA (1867-1944).
Runge [Math. Annalen 46 (1895),167-178], KARL HEUN [Zeitschr. Math. Phys. 45 (1900), 23-38], and
Kutta [Zeitschr. Math. Phys. 46 1901l. 435-453] developed various such methods. Theoretically, there are
infinitely many fourth-order methods using four function values per step. The method in Table 21.4 is most
popular [rom a practical viewpoint because of its "symmetrical" form and its simple coefficients. It was given
by Kutta.
SEC 21.1
893
Methods for First-Order ODEs
E X AMP L E 4
Classical Runge-Kutta Method
Apply the Runge-Kutta method to the initial "alue problem (4) in Example 1, choosing II
computing five steps.
Solutioll.
For the present problem we have f(x. v}
x
=
+ y.
0.2, as before, and
=
Hence
kl = 0.2(xn
+ y,,),
k2 = 0.2(x"
+ 0.1 + y" + O.Sk l ),
k3 = 0.2(x"
+ 0.1 + Yn + 0.Sk2 }.
k4 = 0.2(x"
+ 0.2 + y" + k3 ).
Table 21.5 shows the results and their errors, which are smaller by factors 103 and 104 than those for the two
Euler methods. See also Table 21.6. We mention in passing that since the present k 1• . . . . k4 ate simple,
operations were saved by substituting kl into k2. then k2 into k3 . etc.: the resulting formula is shown in Column
•
4 of Table 21.5.
Table 21.5
Runge-Kutta Method Applied to (4)
n
x"
v
."
0
0.0
0.2
0.4
0.6
0.8
1.0
0
0.021400
0.091 818
0.222107
0.425521
0.718251
2
3
4
5
0.2214(x" + Yn)
+ 0.0214
bxact Values (6D)
y = eX - x-I
106 X Error
0.021400
0.070418
0.130289
0.203 414
0.292730
0.000000
0.021403
0.091825
0.222 119
0.425541
0.718282
0
3
7
12
:W
31
ofy"
Table 21.6 Comparison of the Accuracy of the Three Methods Under Consideration
in the Case of the Initial Value Problem (4), with h = 0.2
Error
!
x
r=ex-x-I
Euler
(Table 21.1)
Improved Euler
(Table 21.3)
Runge-Kutta
(Table 21.5)
0.2
0.4
0.6
0.8
1.0
0.021403
0.091825
0.222119
0.425541
0.718282
0.021
0.052
0.094
0.152
0.229
0.U014
0.0034
0.0063
0.0102
0.0156
0.000003
0.000007
0.000011
0.000020
0.000031
I
Error and Step Size Control. RKF
(Runge-Kutta-Fehlberg)
The idea of adaptive integration (Sec. 19.5) has analog<; for Runge-Kutta (and other)
methods. In Table 21.4 for RK (Runge-Kutta), if we compute in each step approximations
y and .17 with step sizes hand 2h. respectively, the latter has error per step equal to
5
2 = 32 times that of the former; however, since we have only half as many steps for 2h,
the actual factor is 2 5 /2 = 16. so that, say,
and thus
894
CHAP. 21
Numerics for ODEs and PDEs
Hence the error
E =
d- h ) for step size h is abour
(10)
where y problem
y = lh) -
y(2hl, as said before. Table 21.7 illustrates (10) for the initial value
y' =
(1L)
(y - x-l)2
+ 2,
yeo)
= I.
the step size h = 0.1 and 0 ~ x ~ 0.4. We see that the estimate is close to the actual
error. This method of error estimation is simple but may be unstable.
Table 21.7 Runge-Kutta Method Applied to the Initial Value Problem (11)
and Error Estimate (10). Exact Solution y = tan x + x + 1
\:
0.0
0.1
0.2
0.3
0.4
v
y
(Step size /7)
(Step si7e 211)
Error
Estimate (10)
Actual
Error
Exact
Solution (9D)
1.000 000 000
1.200 334 589
1.402 709 878
1.609 336 039
1.822 792 993
l.000 000 000
0.000000000
1.402 707 408
0.000 000 165
1.822 788 993
0.000000 267
0.000000000
0.000 000 083
0.000 000 157
0.000 000 210
0.000 000 226
1.000000000
1.200334672
1.402 710 036
1.609 336 250
l.822 793 219
RKF. E. Fehlberg [Computing 6 (1970), 61-71] proposed and developed error control
by using two RK methods of different orders to go from (xn , Yn) to (xn +1> Yn+l)' The
difference of the computed y-values at Xn+l gives an error estimate to be used for step
size controL Fehlberg discovered two RK formulas that together need only 6 function
evaluations per step. We present these formulas here because RKF has become quite
popular. For instance. Maple uses it (also for systems of ODEs).
Fehlberg's fifth-order RK method is
(12a)
with coefficient vector y = [Yl ... Y6].
(12b)
y=
U;5
o
28561
56430
6656
12825
i5]'
His fourth-order RK method is
(13a)
with coefficient vector
(13b)
Y * -_
[25
216
o
1408
2565
2197
4104
_1]
5'
SEC 21.1
895
Methods for First-Order ODEs
In both formulas we use only 6 different function evaluations altogether. namely.
(14)
k1
= hf(x."
k2
= hf(x., +
k3
= Izf(x n + ~h.
y.,
+
k4
= hf(xn + ~~h,
Yn
+ ~~~~kl
k5
= hf(x., + h.
J'n
+ i~~k1
k6
= hf(x., + ~h.
.\" n -
J'n)
!Iz,
Yn +
!k1)
~k1 +
i2k2)
~~g~k2 + ~~~~k3)
8k2 + 3:~~k3 - 481t~k4)
2k2 - ~~~k3 + ~~g:k4 - ~~k5)'
+
287 k1
The difference of (12) and (13) gives the error estimate
(15)
E X AMP L E 5
En+l -- .\'n+1 -
* --...Lk
Yn+1
360 1
128 k
4275 3
-
2197 k
75240 4
+..!.k
+..2..k
50 5
55 6'
Runge-Kutta-Fehlberg
For the initial value problem (II) we obtain from (12)-(14) with 11= 0.1 in the first step the
0.200062 500000
k1 = 0.200000 000000
k2
k3 = 0.200140756867
k4 = 0.2008.'i6926154
k5 = 0.201006676700
k6 = 0.200250418651
.\'1*
=
=
12S-value~
1.20033466949
.\'1 = 1.20033467253
and the error
e~timate
E1 =."1 -
.\'~ = 0.000000 00304.
The exact 12S-value is .1'(0.1) = 1.20033467209. Hence the actual error of."1 i~ -4.4· 10- 10, smaller than that
in Table 21.7 by a factor 200.
•
Table 21.8 summarizes essential features of the methods in this section. It can be shown
that these methods are Ilumerically stahle (definition in Sec. 19.1). They are one-step
methods because in each step we use the data of just one preceding step. in contrast to
multistep methods where in each step we use data from several preceding steps. as we
shall see in the next section.
Table 21.8
Methods Considered and Their Order (= Their Global Error)
Method
Euler
Improved Euler
RK (fourth order)
RKF
Function Evaluation
per Step
2
4
6
Global Error
Local Error
0(11)
0(17 2 )
2
0(h )
0(114)
0(h 5 )
0(h 3 )
0(11 5 )
0(h 6 )
896
CHAP. 21
Numerics for ODEs and PDEs
Backward Euler Method. Stiff ODEs
The backward Euler formula for numerically solving (I) is
(n = 0, 1, .. ').
(16)
This formula is obtained by evaluating the right side at the new location (Xn +1, )'n+1); this
is called the backward Euler scheme. For known )'n it gives )'n+l implicitly, so it defines
an implicit method, in contrast to the Euler method (3), which gives Yn+l explicitly.
Hence (16) must be solved for )'n+l' How difficult this is depends on f in (1). For a linear
ODE this provides no problem, as Example 6 (below) illustrates. The method is particularly
useful for "stiff' ODEs. as they occur quite frequently in the study of vibrations, electric
circuits. chemical reactions. etc. The situation of stiffness is roughly as follows; for details,
see, for example, [E5]. [E25], [E26] in App. I.
Error terms of the methods considered so far involve a higher derivative. And we ask
what happens if we let h illcrease. Now if the error (the derivative) grows fast but the
desired solution also grows fast. nothing will happen. However. if that solution does not
grow fast, then with growing h the error term can take over to an extent that the numeric
result becomes completely nonsensical. as in Fig. 450. Such an ODE for which h must
thus be restricted to small values, and the physical system the ODE models. are called
stiff. This term is suggested by a mass-:-.pring system with a stiff ~pring (spring with a
large k; see Sec. 2.4). Example 6 illustrates that implicit methods remove the difficulty
of increasing II in the case of stiffness: it can be shown that in the application of an implicit
method the solution remains stable under any increase of h, although the accuracy
decreases with increasing h.
E X AMP L E 6
Backward Euler Method. Stiff ODE
The initial value prublem
y'
= f(x. y) =
-20.,'
+
20x
2
+
2x.
y(O) = I
has the solution (verify!)
y =
e- 20x
+
x 2.
The backward Euler formula (16) is
),,,+1 =
Noting that xn+1
y"
+
hJ(xn+b Yn+1) = Yn
= X" + h. tak:ing the term
Yn
(l6*)
+
h( -20Yn+1
+ 20X~+1 +
2Xn +1)'
-20hY,,+1 to the left. and dividing. we obtain
+
h[20(xn
Yn+l =
+
h}2
+
2(x"
+ h)j
1 + 20h
The numeric re,ulr- in Table 21.9 show the following.
Stability of the backward Euler method for h = 0.05 and also for h = 0.2 with an error increase by about a
factOT 4 for h = 0.2.
Stability of the Euler method fOf h
= 0.05 but instability for h = 0.1
(Fig. 450).
Stability of RK for h = 0.1 but instability for h = 0.2.
1l1b illustrates that the ODE is stiff. Note thm even in the case of stdbility the appruximation of the ~olution
near x = 0 is poor.
•
Stiffness will be considered further in Sec. 21.3 in connection with systems of ODEs.
SEC. 21.1
897
Methods for First-Order ODEs
y
•
2.0
I
I
I
I
I
I
1.0
,
,
,
I
I
I
I
,I
o " lO.2~
:0.4', I.. 0.6
I
,
\
I
\:
-1.0
'I
1,
~:
0.8 4' LO x
'
"
•
Fig. 450. Euler method with h = 0.1 for the stiff
ODE in Example 6 and exact solution
Table 21.9
Backward Euler Method (BEM) for Example 6. Comparison with Euler and RK
BEM
x
h
0.05
=
1.00000
0.26188
0.10484
0.10809
0.16640
0.25347
0.36274
0.49256
0.64252
0.81250
1.00250
0.0
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
1.0
BEM
h = 0.2
Euler
h = 0.05
Euler
h = 0.1
1.00000
1.00000
0.00750
0.03750
0.08750
0.15750
0.24750
0.35750
0.48750
0.63750
0.80750
0.99750
1.00000
-1.00000
1.04000
-0.92000
1.16000
-0.76000
1.36000
-0.52000
1.64000
-0.20000
2.00000
0.24800
0.20960
0.37792
0.65158
1.01032
RK
h
=
RK
0.1
1.00000
0.34500
0.15333
0.12944
0.17482
0.25660
0.36387
0.49296
0.64265
0.81255
1.00252
h = 0.2
1.000
5.093
25.48
127.0
634.0
316H
Exact
1.00000
0.14534
0.05832
0.09248
0.16034
0.25004
0.36001
0.49001
0.64000
0.81000
1.00000
===== .... -........ ,,_ . . . . ..-..-.
=
- - ......- ............... -- .... ---... ..
11-41
EULER METHOD
7.
Do 10 steps. Solve the problem exactly. Compute the error.
(Show the details.)
1. y'
2. )"'
= y, yeO) =
=
y, yeO)
3. y' = (y 4.
y' = (y +
IS-lO I
=
.d.
X)2.
1. h = 0.1
1. h = 0.01
yCO) = 0, h
yCO) = o. h
1. h
6. (Logistic population»)."
=
=
1, h = O. I
8. y' + )' tan x = sin 2x, yeO) = I, h = 0.1
9. Do Prob. 7 using the Euler method with h = 0.1 and
=
0.1
=
0.1
111-171
0.1. Compare with Prob I
y - )'2, yeO)
=
0.2, h
=
0.1
h 2 ,y(0)=0,h=0.1
CLASSICAL RUNGE-KUTTA METHOD
OF FOURTH ORDER
Do 10 steps. Compare as indicated. Comment. (Show the
details. )
11.
y' - xy2 = O• .1'(0) = 1, h = 0.1. Compare with
Prob. 7. Apply (10) to )'10'
12. y'
=
yeo)
compare the accuracy.
Do 10 steps. Solve exactly. Compute the error. (Show the
details.)
=
xy2 = O.
10./ = I
IMPROVED EULER METHOD
S. y' = y. -,,(0)
and comment.
y' -
= y - y2 • .1'(0) = 0.2, h = 0.1. Compare with
Prob. 6. Apply (10) to )'10'
898
13.
14.
15.
CHAP. 21
Numerics for ODEs and PDEs
(b) Graph solution curves of the ODE in (5) for
various positive and negative initial values.
(c) Do a similar experimem as in (a) for an initial
value problem that has a monotone increasing or
monotone decreasing solution. Compare the behavior
of the error with that in (a). Comment.
y' = (l + X-I»),. yO) = e. h = 0.2
y' = !U'lx - xl),). ),(2) = 2. h = 0.2
y' + -" tan x = sin 2x, yeo) = 1. II = 0.1
16. In Prob. 15 use h = 0.2 (5 steps) and compare the error.
17. y' + 5x 4 y2 = 0, yeO) = 1. h = 0.2
20. CAS EXPERIMENT. RKF. (a) W]ite a program for
RKF that gives .tn' y". the estimate (10). and if the
18. Kulla's third-order method is defined by
Yn+l = y" + irk] + 4k2 + k3*) with kl and k2 as in
RK (Table 21.4) and k3* = 1If(xn +].),,, - k] + 2k2 ).
Apply this method to (4) in Example I. Choose
h = 0.2 and do 5 steps. Compare with Table 21.6.
19. CAS EXPERIMENT. Euler-Cauchy vs. RK.
(a) Solve (5) in Example 2 by Euler. Improved Euler.
and RK for 0 ~ x ~ 5 with step h = 0.2. Compare the
errors for x = 1,3,5 and comment.
solution is known. the actual error
En'
(b) Apply the program to Example 5 in the text
(10 steps, 11 = 0.1).
(c) E" in (b) gives a relatively good idea of the size
of the actual error. Is this typical or accidental? Find
out by experimentation with other problems on what
properties of the ODE or solution this might depend.
21.2 Multistep Methods
In a one-step method we compute )'n+1 using only a single step, namely, the previous
value y.,. One-step methods (Ire "self-starting," they need no help to get going because
they obtain ."1 from the initial value ."0' etc. All methods in Sec. 21.1 are one-step.
In contrast, a multistep method uses in each step values from two or more previous
steps. These methods are motivated by the expectation that the additional information will
increase accuracy and stability. But to get staJ1ed, one needs values. say. YO,."b ."2'."3 in
a 4-step method, obtained by Runge-Kutta or another accurate method. Thus, multistep
methods are not self-starting. Such methods are obtained as follows.
Adams-Bashforth Methods
We consider an initial value problem
(1)
y'
=
f(x, y),
as before. with f such that the problem has a unique solution on some open interval
containing .1'0' We integrate y' = f(x, y) from x" to xn+1 = Xn + h. This gives
Now comes the main idea. We replace f(x. y(x» by an interpolation polynomial p(x) (see
Sec. 19.3), so that we can later integrate. This gives approximations )'n+1 of y(xn +1) and
Yn of y(xn ),
x n+ 1
(2)
)'n+1
= )'n +
J
x"
p(x) dr.
SEC. 21.2
899
Multistep Methods
Different choices of p(x) will now produce different methods. We explain the principle
by taking a cubic polynomial, namely, the polynomial P3(x) that at (equidistant)
has the respective values
In
= I(xn , Yn)
(3)
This will lead to a practically useful formula. We can obtainp3(x) from Newton's backward
difference formula (18), Sec. 19.3:
where
x - Xn
h
r=
We integrate p3(X) over x from Xn to Xn+l
x = Xn
The integral oqdr
(4)
I
x".
Xn
+
+ hr.
= Xn + 11. thus over r from 0 to
we have
1) is 5/12 and that ofir(r
1
= 11 dr.
+ l)(r + 2) is 3/8. We thus obtain
(1~VIn+ 5
P3dX=lzlp3dr=h In+
0
dx
1. Since
~)
- V2In+ ':"V- 3fl1 .
2
12
8
It is practical to replace these differences by their expressions in terms of I:
\In
V2In
= In - In-l
= In - 2In-l + 1",-2
We substitute this into (4) and collect terms. This gives the multistep formula of the
Adams-Bashforth method of fourth order
(5)
)'n+l = Yn
+
h
24 (55In - 59In-l
+ 37In-2
- 9In-3)'
It expresses the new value Yn+ I [approximation of the solution y of (I) at Xn+ 1] in terms
of 4 values of.f computed from the y-values ohtained in the preceding 4 steps. The local
truncation error is of order 11 5 , as can be shown, so that the global error is of order 114;
hence (5) does define a fourth-order method.
900
CHAP. 21
Numerics for ODEs and PDEs
Adams-Moulton Methods
Adams-Moulton methods are obtained if for p(x) in (2) we choose a polynomial that
interpolates f(x, .v(x» at x n +l' x n , Xn_1> ... (as opposed to Xn , X"'-I' ... used before; this
is the main point). We explain the principle for the cubic polynomial P3(X) that interpolates
at X n +l' X'" Xn -l' X.,-2' (Before we had x n ' Xn -l, X,,-2' X.,-3') Again using (18) in
Sec. 19.3 but now setting r = (x - x n +1)lh, we have
We now integrate over x from Xn to Xn+l as before. This corresponds to integrating over
r from -I to O. We obtain
Replacing the differences as before gives
(6)
Yn-ll
=
Yn
+
f
lz
Xn+l
P3(X) dx
=
Yn
+ 24
(9f"+1
+ 19fn - 5f,,-1 +
fn-2)'
Xu
This is usually called an Adams-Moulton formula. It is an implicit formula because
fn+l = f(X,,+b )'n+l) appears on the right, so that it defines Yn+l only implicitly, in
contrast to (5), which is an explicit formula, not involving Yn+l on the right. To use (6)
we must predict a value y ~;+ I> for instance, by using (5), that is,
(7a)
Y~+1
= Yn +
h
24 (55fn - 59fn-l
+ 37fn-2
- 9fn-3)'
The corrected new value Yn+ 1 is then obtained from (6) with f n+ 1 replaced by
f~+1 = J(x n +1, Y~+1) and the other fs as in (6); thus,
(7b)
h
Yn+1 = Yn
+ 24
(9f~+1
+
19fn - 5fn-l
+
f n-2)'
This predictor-corrector method (7a). (7b) is usually called the Adams-Moulton
method offourth order. It has the advantage mer RK that (7) gives the error estimate
as can be shown. This is the analog of (10) in Sec. 21.1.
Sometimes the name 'Adams-Moulton method' is reserved for the method with several
corrections per step by (7b) until a specific accuracy is reached. Popular codes exist for
both versions of the method.
Getting Started. In (5) we need fo, fl, f2, f3' Hence from (3) we see that we must first
compute Yr. -"2, .1'3 by some other method of comparable accuracy, for instance, by RK or
by RKF. For other choices see Ref. [E26] listed in App. I.
SEC. 21.2
901
Multistep Methods
E X AMP L E 1
Adams-Bashforth Prediction (7a), Adams-Moulton Correction (7b)
Solve the initial value problem
y'
(8)
=
T
+ y,
by (7a), (7b) on the interval 0 ::;; x ::;; 2. choosing h
=
y(O) =
0
0.2.
Solutio". The problem is the same as in Examples 1-3. Sec. 21.1. so that we can compare the results. We
compute starting values Yl . .'"2• .'"3 by the classical Runge-Kutta method. Then in each step we predict by (7a) and
make one correction by (7b) before we execute the next step. The result~ are shown and compared with the exact
values in Table 21.10. We see that the corrections improve the accuracy considerably. This is typicaL
•
Table 21.10 Adams-Moulton Method Applied to the Initial Value Problem (8);
Predicted Values Computed by (7a) and Corrected values by (7b)
Il
Xn
0
1
2
3
4
5
6
7
8
9
10
0.0
0.2
0.4
0.6
0.8
1.0
1.2
1.4
1.6
1.8
2.0
106 • Error
Yn
Exact
Values
0.425529
0.718270
1.120106
1.655191
2.353026
3.249646
4.389062
0.000 000
0.021403
0.091825
0.222119
0.425541
0.718282
1.120117
1.655200
2.353032
3.249647
4.389056
0
3
7
12
12
12
11
9
6
1
-6
Starting
Predicted
Corrected
Yn
Yn*
0.000000
0.021 400
0.091 818
0.222 107
0.425361
0.718066
1.119855
1.654885
2.352653
3.249 190
4.388505
ofYn
Comments on Comparison of Methods. An Adams-Moulton formula is generally
much more accurate than an Adams-Bashforth formula of the same order. This justifies
the greater complication and expense in using the former. The method (7a), (7b) is
Ilumerically stable, whereas the exclusive use of (7a) might cause instability. Step size
control is relatively simple. If ICorrector - Predictorl > TOL, use interpolation to generate
"old" results at half the current "tep size and then try /1/2 as the new step.
Wherea~ the Adams-Moulton fOIillula (7a), (7b) needs only 2 evaluations per step,
Runge-Kutta needs 4; however, with Runge-Kutta one may be able to take a step size
more than twice as large, so that a comparison of this kind (widespread in the literature)
is meaningless.
For more details, see Refs. [E25], LE26] listed in App. 1.
........
-.. -.. ,,_
...... ..-...... -....
......-.............
--...
1. Carry out and show the details of the calculations
leading to (4 H7) in the text.
12-111
ADAMS-MOULTON METHOD (7a), (7b)
Solve the initial value problems by Adams-Moulton. 10 steps
with I correction ~r step. Solve exactly and compute the
enOL (Use RK where no starting values are given.)
2.),' = J, yeO) = I, II = 0.1 (1.105171, 1.221403,
1.349859)
= -0.2xy. yeO) = I, h = 0.2
4. y' = 2xy, yeo) = I. h = 0.1
5. y' = I + y2, )"(0) = O. h = 0.1
3. y'
6. Do Prob. 4 by RK, 5 steps, h = 0.2. Compare the errors.
CHAP. 21
902
Numerics for ODEs and PDEs
solution and comment.
7. Do Prob. 5 by RK. 5 steps. h = 0.2. Compare the errors.
8.
9.
14. How much can you reduce the error in Prob. 13 by
halving II (20 steps. h = 0.05)? First guess, then
compute.
y' = xly. y(l) = 3, h = 0.2
y' = (x + Y - 4)2, y(O) = 4, h = 0.2. only 7 steps
(why?)
15. CAS PROJECT. Adams-Moulton. (a) Accurate
starting is important in (7a), (7b). Illustrate this in
Example I of the text by using starting values from the
improved Euler-Cauchy method and compare the
results with those in Table 21.9.
(b) How much does the error in Prob. 11 decrease if
you use exact starting values (instead of RK-values)?
(c) Experiment to find out for what ODEs poor
starting is very damaging and for what ODEs it is not.
(d) The classical RK method often gives the same
accuracy with step 211 as Adams-Moulton with step
h. so that the total number of function evaluations is
the same in both cases. Illustrate this with Prob. 8.
(Hence corresponding comparisons in the literature in
favor of Adams-Moulton are not valid. See also
Probs. 6 and 7.)
y' = I - 4.\'2, y(O) = O. II = 0.1
11. y' = x + y . .1'(0) = O. h = 0.1 (0.00517083.
10.
0.0214026.0.04(8585)
12. Show that by applying the method in the text to a
polynomial of second degree we obtain the predictor
and corrector formulas
II
Y~+1 = Yn
+ 12
Yn+l = Yn
+ 12
(23fn -
h
(5fn+l
16f,,_1
+
+
5f,,-2)
8fn - fn-l)'
13. Use Prob. 12 to solve y' = 2x.\". y(O) = I (10 steps,
It = 0.1, RK starting values). Compare with the exact
21.3
Methods for Systems
and Higher Order ODEs
Initial value problems for first-order
(1)
y'
=
system~
f(x.
of ODEs are of the form
y).
in components
f is assumed to be such that the problem has a unique solution y(x) on some open x-interval
containing Xo. Our discussion will be independent of Chap. 4 on system~.
Before explaining solution methods it is important to note that (l) includes initial value
problem:-- for single mth-order ODEs,
(2)
y'>n)
= f(x, y, y',
and initial conditions )'(xo) = Kb
y' (xo)
.v"..... y(Tn-D)
= K 2 , ••• , y<m-l)(xo) = Km as special cases.
SEC. 21.3
Methods for Systems and Higher Order ODEs
903
Indeed, the connection is achieved by setting
(3)
)' = _
1: ",
I
_\. 1 = _\' ,
_3
)'2 =)'
Then we obtain the system
,
)'1 =)'2
(4)
,
Y'111-1 ~ Y·uz.
Y~t
= I(x.
Yl ....• Ym)
Euler Method for Systems
Methods for single first-order ODEs can be extended to systems (1) simply by writing vector
functions Y and f instead of scalar functions y and I, whereas x remains a scalar variable.
We begin with the Euler method. Just as for a single ODE this method will not be
accurate enough for practical purposes, but it nicely illustrates the extension principle.
E X AMP L E 1
Euler Method for a Second-Order ODE. Mass-Spring System
Solve the initial value problem for a damped mass-spring system
y"
+ 2/ +
0.75.1' = 0,
1'(0)
= 3.
/(0) = -2.5
by the Euler method for systems with step h = 0.2 for x from 0 to I (where x is time).
Solutioll.
The Euler method (3). Sec. 21.1. generalizes to systems in the form
(5)
in
Yn+l
= Yn + hf(xn' Yn),
componelll~
and similarly for systems of more than two equations. By (4) the given ODE converts to the system
y~
=
y~
= .f2(X• ."1' .1'2) = -2Y2 - 0.75.\"1'
h(x. Yl' .1'2)
= Y2
Hence (5) becomes
Yl.n+l = )'1,11.
+ O.2Y2.n
)'2.n+l = )'2,n
+
0.2(-2)'2,11. - 0.75Yl,n)'
The initial conditions are y(O) = )'1(0) = 3, -'" (0) = Y2(O) = -2.5. The calculation~ are shown in Table 21.11
on the next page. As for single ODEs. the results would not be accurate enough for practical purposes. The
example merely serves to illustrate the method because the problem can be readily solved exactly.
thu~
•
904
CHAP. 21
Table 21.11
Numerics for ODEs and PDEs
Euler Method for Systems in Example 1 (Mass-Spring System)
."1 Exact
.1" 1.71
0
2
3
4
0.0
0.2
0.4
0.6
0.8
5
1.0
(5D)
3.00000
2.50000
2.11000
1.80100
1.55230
1.34905
buor
.\'2 Exact
)"2.11
= )"1 - )"I.n
EI
3.00000
2.55049
2.18627
1.88821
1.64183
1.43619
-2.50000
- 1.95000
-1.54500
-1.24350
-1.01625
-0.84260
0.00000
0.05049
0.76270
0.08721
0.08953
0.08714
Euor
(5D)
E2
-2.50000
-2.01606
-1.64195
-1.35067
-1.12211
-0.94123
= )"2
-
)"2."
0.00000
-0.06606
-0.09695
-0.10717
-0.10586
-0.09863
Runge-Kutta Methods for Systems
As for Euler methods, we obtain RK methods for an initial value problem (1) simply by
writing vector formulas for vectors with III components, which for 111 = 1 reduce to the
previous scalar formulas.
Thus for the classical RK method of fourth order in Table 21.4 we obtain
(6a)
(Initial values)
and for each step
11
= 0, 1, ... , N
- 1 we obtain the 4 auxiliary quantities
ki
=
II f(x n ,
k2
=
hf(xn
+
!h,
Yn
+ !kI )
k3
=
hf(xn
+
!h,
Yn
+ !k2)
Yn)
(6b)
and the new value [approximation of the solution y(x) at x n + 1 = Xo
+
(n
+ I)h1
(6c)
E X AMP L E 2
RK Method for Systems. Airy's Equation. Airy Function Ai(x)
Solve the initial value problem
y" =
X-,"~
.1"(0) ~ 1/(32/3. rO/3J) = 0.35502805.
/(0) = -1I(3113·r(lf3)) = -0.25881940
by the Runge-Kutta method for systems with h = 0.2: do 5 steps. This is Airy's equation,2 which arose in
optics (see Ref. [Al3]. p. 188. listed in App. I). r is the gamma function (see App. A3.11. The initial conditions
are such that we obtain a standard solution, the Airy function Ai(x), a special function that has been thoroughly
investigated: for numeric values. see Ref. [GRI], pp. 446. 475.
2Named after Sir GEORGE BIDELL AIRY (1801-1892), English mathematician. who is known for his work
in elasticity and in PDEs.
SEC. 21.3
905
Methods for Systems and Higher Order ODEs
Solution.
For y"
=
xy.
setting YI = Y,)"2 =
yi =
v' we obtain the system (4)
,
)"1 =)"2
Hence f = [h f2]T in (1) has the components hex. y) = .\'2' f2(X. Y) = XYI' We now write (6) in
components. The initial conditions (6a) are YI.O = 0.35502805. )"2.0 = -0.25881 940. In 16b) we have
fewer subscripts by simply writing kl = a, k2 = b. k3 = C, ~ = d. so that a = [01 02]T. etc. Then (6b)
takes the form
a=h
Y2,n
[
]
X·n Yl.1l
(6b*)
For example. the second component of b
Now in b (= k 2 ) the first argument is
i~
obtained as follows. fIx. y) has the second component f2(x, y) =
x = x"
+ lh.
Y = Yn
+ !a.
X\'I'
The second argument in b is
and the first component of this is
Together.
Similarly for the other components in (6b*). Finally.
(6c*)
Yn+! = Yn
+ !(a +
2b
+ 2c + d).
Table 21.12 shows the values y(x) = YI(x) of the Airy function Ai(x) and of its derivative
as of the (rather small!) error of y(x).
y' (x)
= )'2(x)
a~ well
•
Table 21.12 RK Method for Systems: Values Yl,n(X n) of the Airy Function Ai(x)
in Example 2
II
Xn
Yl,n(X n )
Yl(X n ) Exact (8D)
108 • Error of .V I
Y2,n(Xn )
0
0.0
0.2
0.4
0.6
0.8
1.0
0.35502805
0.30370303
0.25474211
0.20979973
0.16984596
0.13529207
0.35502805
0.30370315
0.25474235
0.20980006
0.16984632
0.13529242
0
12
24
33
36
35
-0.25881 940
-0.25240464
-0.23583 073
-0.21279 185
-0.18641 171
-0.15914687
2
3
4
5
906
CHAP. 21
Numerics for ODEs and PDEs
Runge-Kutta-Nystrom Methods (RKN Methods)
RKN methods are direct extensions of RK methods (Runge-Kutta methods) to secondorder ODEs y" = f(x, y. y'), as given by the Finnish mathematician E. J. Nystrom [Acta
Soc. Sci. fenn., 1925, L, No. 131. The best known of these uses the following formulas,
where 11 = 0, I, ... , N - I (N the number of steps):
kl
~h.f(xn' -".", y~)
=
k2 = ~hf(xn
(7a)
+
~h.
)'n
+ K. y~ +
k3
= ~Izf(xn + ~h . ."n +
k4
=
~hf(xn + h, ."n
K.
k1)
where K
=
~h (y~
where L
=
h (y~
+ ~kl)
Y:L + k 2)
+ L, .":1 +
2k3)
From this we compute the approximation )'n+l of Y(Xn+lJ at Xn+l
=
+
Xo
k3)'
+
(n
+
l)h,
(7b)
and the approximation y~+ 1 of the derivative y' Ltn+ 1) needed in the next step.
(7c)
=
RKN for ODEs y"
f(x, y) Not Containing y'. Then k2
the method particularly advantageous and reduces (7) to
(7*)
=
kl
= ~Izf(xn' Yn)
k2
= ~hf(xn + ~h. Yn + ~h(y~ + ~kl» =
k4
=
Yn+l
~hf(xn
+
h, Yn
+
h(Y:1
= Yn + h(y~ + !(k1 +
+
k3 in (7), which makes
k3
k 2)
2k2 »
Y~+1 = Y:1 + -!(k1 + 4k2 + k 4 )·
E X AMP L E 3
RKN Method. Airy's Equation. Airy Function Ai(x)
For the problem
k2
=
k3
In
=
Example 2 and"
O.l(xn
+
0.))(.\""
+
=
0.2 as betore we obtain from (Y) silllply
O.IY;l
+ 0.05k1 !.
k4
=
O.I(xn
+
"1
0.2)(y"
= O.I.~nYn
and
+ O.2y~ + 0.2k2)·
Table 21.13 ,how, the result,. The accuracy is the 'ame as in Example 2. but the work wa' much less.
•
Table 21.13 Runge-Kutta-Nystrom Method Applied to Airy's Equation, Computation of
the Airy Function y = Ai(x)
,
Xn
Yn
\'
,n
0.0
0.2
0.4
0.6
0.8
0.35502805
0.303 703 04
0.254742 II
0.20979974
0.16984599
0.135292 18
-0.258 81940
-0.252404 64
-0.235 830 70
-0.21279172
-0.18641134
-0.15914609
LO
y(x) Exact (8D)
0.35502805
0.303 703 15
0.25474235
0.209800 06
0.16984632
0.13529242
lOB. Error
of -"n
0
II
24
32
33
24
I
SEC. 21.3
907
Methods for Systems and Higher Order ODEs
Our work in Examples 2 and 3 also illustrates that usefulness of methods for ODEs in the
computation of values of "higher transcendental functions."
Backward Euler Method for Systems. Stiff Systems
The backward Euler formula (16) in Sec. 21.1 generalizes to systems in the form
(8)
+
Yn+l = Yn
h f(Xn+l'
(11
Yn+l)
= O. I ... ').
This is again an implicit method. giving Yn+l implicitly for given Yn- Hence (8) must be
solved for Yn+l' For a linear system this is shown in the next example. This example also
illustrates that. similar to the case of a single ODE in Sec. 21.1, the method is very useful
for stiff systems. These are systems of ODEs whose matrix has eigenvalues A of very
different magnitudes, having the effect that, just as in Sec. 21.1, the step in direct methods,
RK for example, cannot be increased beyond a certain threshold without losing stability.
(A = -I and -10 in Example 4, but larger differences do occur in applications.)
E X AMP L E 4
Backward Euler Method for Systems of ODEs. Stiff Systems
Compare the backward Euler method (8) with the Euler and the RK methods for numerically solving the initial
value problem
y" + 11/ + lOy
= lOx
+ II.
)"'(0) =
"(0) = 2.
-10
converted to a system of first-order ODEs,
Solution.
The given problem "an easily be '<JIved, obtaining
y = e- x
+ e- lOx + x
so that we can compute errors, Conversion to a system by setting Y = Yl.)"' = Y2 [see (4)] gives
,
Yl
)"1(0) =
=)"2
y~ = -IOYI - 11.'"2
+
lOx
+ II
2
)"2(0) = -10,
The coefficient matrix
-A
has the characteristic determinant
whose value is A2 + IIA + lO = (A
The backward Euler formula is
Yn+l
=
.'"I,n+lJ
[
=
."2.n+l
+
I)(A
I-10
+ 10), Hence the eigenvalues are
-I and -lO as claimed above.
_"I'''J
[
"2,n+l
J
[ Y2.11 + Ii -10)"I,n+1 - IlY2,n+l + lOXn+l + II -
Reordering terms gives the linear system in the unknowns )"I,n+l and )"2.n 11
11)'2,n+l
)"1.,,+1 -
IOh)'1.,,+1
+
(1
+ 1 Ih)Y2_11+ 1
=
)"I,n
= )'2;n
+ 1011!-,,, +
h) -t
1111.
The coefficient determinant is D = I + IIIz + IOh 2 , and Cramer's rule (in Sec 7_6) gives the solution
1I1i)Yl.n
+
hY2,n
+
10112xn
+
11112
-lOhY1,,,
+
Y2,n
+
lOlnn
+
1111
0 +
Yn+l =
D
[
+
3
1011 ] .
+ 1011 2
CHAP. 21
908
Numerics for ODEs and PDEs
Table 21.14
Backward Euler Method (BEM) for Example 4. Comparison with Euler and RK
BEM
h = 0.2
x
h
BEM
= 0.4
0.0
0.2
2.00000
1.36667
2.00000
0.4
0.6
0.8
1.0
1.2
1.4
1.20556
1.21574
1.29460
1.40599
1.53627
1.67954
L.83272
1.99386
2.16152
1.31429
l"
1.8
2.0
Euler
= 0.1
Euler
h = 0.2
RK
RK
h = 0.2
Iz = 0.3
2.00000
1.01000
1.56100
1.13144
1.23047
1.34868
1.48243
1.62877
1.78530
1.95009
2.12158
2.00000
0.00000
2.04000
0.11200
2.20960
0.32768
2.46214
0.60972
2.76777
0.93422
3.10737
2.00000
1.35207
1.18144
1.18585
1.26168
1.37200
1.50257
1.64706
1.80205
1.96535
2.13536
2.00000
h
1.35020
1.57243
1.86191
2.18625
Exact
2.00000
L.l5407
L.08864
1.15129
1.24966
1.36792
1.50120
1.64660
1.80190
1.96530
2.13534
3.03947
5.07561}
8.72329
Table 21.14 shows the following.
Stability of the backward Euler method for h
accuracy for increasing Ii.
~
Stability of the Euler method for Ii
Stability of RK for h
=
~
0.2 and 0.4 (and in fact for any Ii; try Ii = 5.0) with decreasing
0.1 but instability for h
~
0.2.
0.2 but instability for II = 0.3.
Figure 451 show. the Euler method for Ii ~ 0.18, an interesting case with initial jumping (for about x < 3) but
later monotone following the solution curve of y = Y1' See also CAS Experiment 21.
•
y
..
4.0
-..-
3.0
~
.-
2.0
1.0
0
Fig. 451.
-===== -.•. -.. -. -....
•
_
-
~
4. Y~
Y2, Y1(0)
=
x
0.18 in Example 4
=
I, )'2(0)
=
Yb Y; = -J'2, .VI(O)
=
2, Y2(0)
=
2, h
= 0.1,
10 steps
5. y"
2. y~ = -3)"1 + )"2' y~ = Y1 - 3)"2' )"1(0) = 2, )"2(0) = 0,
h = 0.1,5 steps
=
4
3
5 steps
EULER FOR SYSTEMS
AND SECOND-ORDER ODES
Solve by the Euler method:
Y;
2
........
L~
)"1,
•
Euler method with h
1. Verify the calculations in Example I.
3. y~ =
I
\I
\I
. . .-..--
.. _ - . · . . . . . . . A -~
__ .....
_
"
""
"I
=
-1, h = 0.2,
+
4y =
0, yeO) = 1, y'(O) = 0, h
0.2,
1, y' (0)
0.1,
5 steps
6. y" - )'
5 steps
x, nO)
-2, h
SEC. 21.4
909
Methods for Elliptic PDEs
7. y~
= -.\"1 + Y2' Y; = -.\"1 - Y2, ."1(0) =
.\"2(0) = 4, h = 0.1, 10 steps
o.
17. ,," -
i,
~ -141
RK FOR SYSTEMS
Solve by the classical RK:
9. The system in Prob. 7. How much smaller is the error?
10. The ODE in Prob. 6. By what factor did the error
decrease?
11. Undamped Pendulum. Y" + siny = 0, yt77) = O.
y' (77) = 1. h = 0.2, 5 steps. How doe~ your result fit
into Fig. 92 in Sec. 4.5?
12. Bessel Function 10 , xy" + y' + xy = 0,
y(l) = 0.765198,/ (I) = -0.-1-40051, h = 0.5,5 steps.
(This gives the standard solution 10(x) in Fig. 107 in
Sec. 5.5.)
13•
.r ~
=
-4Y1
+ )'2'
y~ =
.\'1 -
4)'2'
YI(O) = 0,
Y2(0) = 2, h = 0.1, 5 steps
14. The system in Prob. 2. Ho'W much smaller is the error?
15. Verify the calculations for the Airy equation in
Example 3.
fi.-I91
RUNGE-KUTTA-NYSTROM METHOD
=
-
3, ,,' (0)
6x 2 -+ 3.)
(x
2
I -
=
!
0,
In 2.
20. CAS EXPERIMENT. Comparison of Methods. (a)
Write program~ for RKN and RK for systems.
(b) Try them out for second-order ODEs of your
choice to find out empirically which is better in specific
cases.
(c) In using RKN, would it pay to first eliminate y'
(see Prob. 29 in Problem Set 5.5)? Find out
experimentally.
21. CAS
EXPERIMENT. Backward Euler
Stiffness. Extend Example 4 as follows.
and
(a) Verify the values in Table 21.14 and show them
graphically as in Fig. 451.
(b) Compute and graph Euler values for h near the
"critical" h = 0.18 to determine more exactly when
instability starts.
(c) Compute and graph RK values for values of h
between 0.2 and 0.3 to find h for which the RK
approximation begins to increase away from the exact
solution.
(d) Compute and graph backward Euler values for
large h: confirm stability and investigate the error
increase for growing h.
Do by RKN:
16. Prob. 12 (Bessel function Jo). Compare the results.
21.4
n" + 4,' = 0, ,'(0)
0.2: 5 steps '(Exact: y '= x4
- x)y" - xy' + y = 0, y(!) =
y'(!) = I - In 2, h = 0.1. 4 steps
19. Prob. II. Compare the results.
18.
8. Verify the formulas and calculations for the Airy
equation in Example 2.
=
Methods for Elliptic PDEs
The remaining sections of this chapter are devoted to numerics for PDEs (partial
differential equations), particularly for the Laplace, Poisson, heat, and wave equations.
These POEs are basic in applications and, at the same time, are model cases of elliptic,
parabolic, and hyperbolic PDEs, respectively. The definitions are as follows. (recall also
Sec. 12.4).
A POE is called quasilinear if it is linear in the highest derivatives. Hence a secondorder quasilinear PDE in two independent variables x, y is of the form
(I)
1I is an unknown function of x and y (a solution sought). F is a given function of the
indicated variables.
Depending on the discriminant ac - b 2 , the PDE (I) is said to be of
elliptic type
if
ac - b 2 > 0
(example: Lap/ace e£juation)
parabolic type
if
ae - b 2 = 0
(example: beat equation)
hyperbolic type
if
lie - b 2 < 0
(example: wal'e equation).
910
CHAP. 21
Numerics for ODEs and PDEs
Here. in the heat and wave equations, y is time t. The coefficients G, b, c may be functions
of x, y. so that the type of (I) may be different in different regions of the xy-plane. This
classification is not merely a formal matter but is of great practical importance because
the general behavior of solutions differs from type to type and so do the additional
conditions (boundary and initial conditions) that must be taken into account.
Applications involving elliptic equatio1ls usually lead to boundary value problems in a
region R. called a first boundary value problem or Dirichlet problem if u is prescribed
on the boundary curve C of R, a second bOllndary I'llllle problem or Neumann problem
if lln = au/all (normal derivative of lI) is prescribed on C. and a third or mixed problem
if II is prescribed on a part of C and Un on the remaining part. C usually is a closed curve
(or sometimes consists of two or more such curves).
Difference Equations for the Laplace and
Poisson Equations
In this section we consider the Laplace equation
(2)
and the Poisson equation
(3)
These are the most important elliptic PDEs in applications. To obtain methods of numeric
solution. we replace the pmtial derivatives by conesponding difference quotients, as
follows. By the Taylor formula,
(4)
(b)
We subtract (4b) from (4a), neglect terms in h3 , 114, ..• , and solve for
(5a)
ux(x, y) =
1
21z [u(x
+ ", y)
-
llx.
Then
h. y)].
lI(x -
Similarly,
u(x. y
+ k)
= u(x. y)
+
/...uy(x, y)
+ ~k2Uyy(X, y) + .. -
and
u(x, y - k)
=
lI(X. y) -
kuy{.r. y)
+ ~k2Uyy(x, y) + - ...
By subtracting, neglecting tenns in k 3 , k4, ... , and solving for u y we obtain
~
(5b)
I
lIy(x,
y) =
2k [u(x, Y
+ k)
- u(x, Y - k)].
SEC. 21.4
911
Methods for Elliptic PDEs
We now turn to second derivatives. Adding (4a) and (4b) and neglecting terms in
h 5 , . • . , we obtain u(x + h, y) + u(x - h, y) = 2u(x, y) + h 2 ux .",(x, y). Solving for
u xx ' we have
h4,
(6a)
I
l/xx(x, Y) =
fl
uyy(x. y) =
k2
+
[u(x
h, y) - 2l/(x, y)
+
+ k) -
+ u(x. y
It(x -
/z, y)].
Similarly,
(6b)
1
[u(x. y
2u(x. y)
-
k)].
We shall not need (see Prob. 1)
1
+ h. \'
'4hk'
+ k)
UXy(x. v) = - - [u(x
(6c)
.
- u(x - h. v
.
- u(x
+
+
k)
h, y - k)
+
u(x - h, Y - k)].
Figure 452a shows the points (x + h, y), (x - h, y), ... in (5) and (6).
We now substitute (6a) and (6b) into the Poisson equation (3), choosing k
a simple formula:
(7)
u(x
+ h, y) + It(x, Y + h) + u(x -
h, y)
+ u(x, Y -
=
=
h) - 4lt(x, y)
h to obtain
h 2 f(x, y).
This is a difference equation corresponding to (3). Hence for the Laplace equation (2)
the corresponding difference equation is
(8)
u(x
+ h, y) +
u(x, y
+ h) +
u(x - h, y)
+
u(x, y - h) - 4u(x, y) = O.
h is called the mesh size. Equation (8) relates u at (x. y) to u at the four neighboring points
shown in Fig. 452b. It has a remarkable interpretation: u at (x, y) equals the mean of the
values of u at the four neighboring points. This is an analog of the mean value property
of harmonic functions (Sec. 18.6).
Those neighbors are often called E (East), N (North), W (West), S (South). Then
Fig. 452b becomes Fig. 452c and (7) is
(7*)
u(E)
+
u(N)
+
N
X
x
X
kl
hi
h
(x-h,y) x~x (x+h,y)
(x,y)
k
u(S) - 4u(x, y) = h 2 f(.'(, y).
(x,y+hJ
(x,y+k)
h
+
u(W)
I
h
(x-h,y)
x--
r"
.
hi
h
--x
(x,y)
(x+h,y)
h
x
(x,y-k)
(aJ Points in (5) and (6)
Fig. 452.
W X __
h_
-6 _h_X
(x,Y)
h
X
x
(x,y-h)
s
(b) POints In (7) and (8)
(e) Notation in (7*)
Points and notation in (5)-(8) and (7*)
E
912
CHAP. 21
Numerics for ODEs and PDEs
nn
Our approximation of h 2 ',pu in (7) and
is a 5-point approximation with the
coefficient scheme or stencil (also called pattern. lIloleeule. or star)
(9)
r
-4
I}' We
may
now write (7) a, {I
-4
I}
u
~ h'!,<,
y).
Dirichlet Problem
In numerics for the Dirichlet problem in a region R we choose an h and introduce a sljuare
grid of horizontal and vertical straight lines of distance h. Their intersections are called
mesh points (or lattice poillts or nodes). See Fig. 453.
Then we approximate the given PDE by a difference equation [(8) for the Laplace equation],
which relates the unknown values of u at the mesh points in R to each other and to the given
boundary values (details on p. 913). This gives a linear system of algebraic equations. By
solving it we get approximations of the unknown values of u at the mesh points in R.
We shall see that the number of equations equals the number of unknowns. Now come"
an important point. If the number of internal mesh points. call it p, is small, say, p < 100,
then a direct solution method may be applied to that linear system of p < 100 equations
in p unknowns. However, if p is large, a storage problem will arise. Now since each
unknown u is related to only 4 of its neighbors, the coefficient matrix of the system is a
sparse matrix, that is. a matrix with relatively few nonzero entries (for instance, 500 of
10000 when p = 100). Hence for large p we may avoid storage difficulties by using an
iteration method. notably the Gauss-Seidel method (Sec. 20.3), which in PDEs is also
called Liebmann's method. Remember that in this method we have the storage
convenience that we can overwrite any solution component (value of ll) as soon as a "new"
value is available.
Both cases, large p and small p, are of interest to the engineer, large p if a fine grid is
used to achieve high accuracy, and small p if the boundary values are known only rather
inaccurately, so that a coarse grid will do it because in this case it would be meaningless
to try for great accuracy in the interior of the region R.
We illustrate this approach with an example. keeping the number of equations small,
for simplicity. As convenient llOllitiOIlS for mesh poillts lllld correspondillg Vlt/ues of the
solution (and of approximate solutions) we use (see also Fig. 453)
Pij
(10)
=
(ih. jh),
llij
= uUh, jll).
y
x
Fig. 453. Region in the xy-plane covered by a grid of mesh h,
also showing mesh points Pll = (h, h), ... , Pij = (ih, jh), ...
SEC. 21.4
Methods for Elliptic PDEs
913
With this notation we can write (8) for any mesh point
(11)
EXAMPLE 1
Ui+l,j
+
Ui,j+l
+ Ui-l,j +
Pi)
in the form
4uij
Ui,j-l -
O.
=
Laplace Equation. Liebmann's Method
The four sides of a square plate of side 12 em made of homogeneous material are kept at constant temperature
ooe and IDOoe as shown in Fig. 454a. Using a (very wide) grid of mesh.j. cm and applying Liebmann's method
(that is, Gauss-Seidel iteration), find the (steady-state) temperature at the mesh points.
Solution.
In the case of independence of time. the heat equation (see Sec. 10.8)
reduces to the Laplace equation. Hence our problem i, a Dirichlet problem for the latter. We ehoo,e the grid
shown in Fig. 454b and consider the mesh points in the order Pn, P 21 . P 12 • P 22 . We use (11) and, in each
equation. take 10 the right all the terms resulting from the given boundary values. Then we obtain the system
=
+
u22
= -200
- 4"12 +
U22
= -100
(12)
Un
U21
-200
+
U12 -
4U22 =
-100.
In practice, one would solve such a small system by the Gauss elimination, finding "n = "21 = 87.5.
"12 = U22 = 62.5.
More exact values (exact to 3S) of the solution of the actual problem [as opposed to its model (12)] are 88.1
and 61.9. respectively. (These were obtained by using Fourier series.) Hence the error is about 1%. which is
surprisingly accurate for a grid of such a large mesh size h. If the system of equations were large, one would
solve it by an indirect method, such as Liebmann's method. For (12) this is as follows. We write (12) in the
form (divide by -4 and take terms to the right)
[/n =
0.25"21
+
1112 = 0.25"n
0.25u21
+
+ 50
0.25u12
+
0.25u22
+ 50
+
0.25u22
+ 25
+ 25.
0.25u12
These equations are now used for the Gauss-Seidel iteration. They are identical with (2) in Sec. 20.3, where
un = Xl' U21 = X2, "12 = X3, U22 = X4. and the iteration is explained there, with 100, 100, 100, 100 chosen
as starting values. Some work can be saved by better starting values, usually by taking the average of the
bound:.u)' values that enter into the linear system. The exact solution of the system is Ull = "21 = 87.5.
u12 = U22 = 62.5, as you may verify.
u=o
y
u=o
121------'"'1
P 22
P12
*-4
u = 100
--,
R
POl
Ipll
u = 100
I p21
-e - .
IPlO 'P
I
20
U
(a) Given problem
=
100
(b) Grid and mesh points
Fig. 454.
Example 1
914
CHAP. 21
Numerics for ODEs and PDEs
Remark.
It is interesting to note that if we choose mesh h = LlII (L = side of R) and consider the (II - 1)2
internal
me~h
points (i.e .. mesh points not on the boundary) row by row in the order
then the system of equations has the
(11 -
X (II -
1)2
1)2
coefficient matrix
-4
B
-4
B
(13)
Here
A =
B=
-4
B
-4
B
is an (II - I) X (11 - 1) matrix. (In (12) we have 11 = 3. (11 - 1)2 = 4 internal mesh points. two submatrices
B. and two submatrices I.) The matrix A is nonsingular. This follows by noting that the off-diagonal entries in
each row of A have the sum 3 (or 2). wherea~ each diagonal entry of A equals -4. so that non~ingularity is
implied by Gerschgorin's theorem in Sec. 20.7 because no Gerschgorin disk can include O.
•
A matrix is called a band matrix if it has all its nonzero entries on the main diagonal
and on sloping lines parallel to it (separated by sloping lines of zeros or not). For example.
A in (13) is a band matrix. Although the Gauss elimination does not pre~erve zeros between
bands. it does not introduce nonzero entries outside the limits defined by the original
bands. Hence a band structure is advantageous. In (13) it has been achieved by carefully
ordering the mesh points.
ADI Method
A matrix is called a tridiagonal matrix if it has all its nonzero entries on the main diagonal
and on the two sloping parallels immediately above or below the diagonal. (See also
Sec. 20.9.) In this case the Gauss elimination is particularly simple.
This raises the question of whether in the solution of the Dirichlet problem for the
Laplace or Poisson equations one could obtain a system of equations whose coefficient
matrix is tridiagonal. The answer is yes, and a popular method of that kind, called the
ADI method (alternating direction implicit method) was developed by Peaceman and
Rachford. The idea is as follows. The stencil in (9) shows that we could obtain a tridiagonal
matrix if there were only the three points in a row (or only the three points in a column).
This suggests that we write (II) in the form
(l4a)
Ui-l,j -
4Uij
+
ui+I,j
=
-Ui,j-l -
lIi,j+I
so that the left side belongs to y-Row j only and the right side to x-Column i. Of course,
we can also write (11) in the form
(l4b)
lIi.j-1 -
4Uij
+
Ui.j+l
=
-Ui-I,j -
ui+l,j
so that the left side belongs to Column i and the right side to Row j. In the AD! method
we proceed by iteration. At every mesh point we choose an arbitrary starting value u~T.
In each step we compute new values at all mesh points. In one step we use an iteration
SEC 21.4
91S
Methods for Elliptic PDEs
formula resulting from (14a) and in the next step an iteration formula resulting from (14b),
and so on in alternating order.
In detail: suppose approximations lI~j) have been computed. Then, to obtain the next
approximations lI~'JHl), we substitute the 1I~'J') on the right side of (l4a) and solve for the
l/~j'H1) on the left side; that is, we use
(lSa)
We use (lSa) for a fixed j, that is, for a fixed row j, and for all internal mesh points in
this row. This gives a linear system of N algebraic equations (N = number of internal
mesh points per row) in N unknowns, the new approximations of l/ at these mesh points.
Note that (ISa) involves not only approximations computed in the previous step but also
given boundary values. We solve the system (15a) (j fixed!) by Gauss elimination. Then
we go to the next row, obtain another system of N equations and solve it by Gauss. and
so on, until all rows are done. In the next step we alternate direction, that is, we compute
the next approximations u~jn+2) column by column from the ugn + ll and the given boundary
values, using a fonnula obtained from (l4b) by substituting the u~jn+]) on the right:
(lSb)
For each fixed i, that is. for each colulIlll. this is a system of M equations (M = number
of internal mesh points per column) in M unknowns. which we solve by Gauss elimination.
Then we go to the next column. and so on, until all columns are Jone.
Let us consider an example that merely serves to explain the entire method.
E X AMP L E 2
Dirichlet Problem. ADI Method
Explain the procedure and formulas of the AD! method in terms of the problem in Example I. using the same
grid and starting values 100. 100. 100. 100.
Solution.
While working. we keep an eye on Fig. 454b on p. 913 and the given boundary values. We obtain
11\]1.. Iti~. lI\]d from (15a) with 111 = O. We w,ite boundary values contained in (15a)
first approximations
without an upper index. for better identification and to indicate that these given values remain the same during
the iteration. From (15a) with 111 = a we have for j = I (first row) the system
"ttl..
(i~1)
(i
The solution is
ilill =
=
2)
u~li ~ 100. Fori = 2 (second row) we obtain fium (15a) the system
(i = I)
(i = 2)
The solution is /li~ ~
/I\]d
~ 66.667.
Secolld approximatiolls 1tW,. iI~1. Iti~. /I~i are now obtained from (l5b) with 111 = I by using the first
approximations just computed and the boundary values. For i ~ I (first column) we obtain from (15b) the system
(J = 1)
= -1101 -
II'll
(J = 2)
The solution is
/lfl. =
91.11.
,,cli =
(J =
I)
(J
2)
~
64.44. For i = 2 (second column) we obtain from (15b) the system
The solutIOn is /I~l = 91.11, /I~i = 64,44.
=
-1t'iV. - It:ll
Numerics for ODEs and PDEs
CHAP. 21
916
In this example. which merely serves to explain the practical procedure in the AD! method. the accuracy
of the second approximations is about the same as that of two Gauss-Seidel steps in Sec. 20.3 (where
IIU = xl' lt2l = .\"2' lt12 = X3' 1122 = X4)' as the following table shows.
-Method
lin
lI21
AD!. 2nd approximations
Gauss-Seidel, 2nd approximations
Exact solution of (12)
91.11
93.7S
87.S0
91.11
90.62
87.50
-
64.44
6S.62
62.50
64.44
64.06
62.50
•
Improving Convergence. Additional improvement of the convergence of the ADI
method results from the following interesting idea. Introducing a parameter p, we can also
write (11) in the form
(a)
Ui-l,j -
(2
+ P)Uij + ui+l,j
=
-Ui,j-l
+
(2 -
(b)
lli,j-l -
(2
+ p)uij + ui.j+l
=
-Ui-l,j
+
(2 - P) lI ij
P)Uij -
Ui,j+l
(16)
-
lIi+l,j'
This gives the more general ADI iteration formulas
(a)
(17)
(b)
u\m+2) _
1,,)-1
(')
....
+ p)1l\1n+2) +
1,)
1l\,,!-+2)
1.,)+1
=
_u\m+D
z-I,)
+ ("") L.
p)u(m+D 'lJ
u(m+~)
1,+1,) .
For P = 2. this is (IS). The parameter p may be used for improving convergence. Indeed,
one can show that the ADI method converges for positive p. and that the optimum value
for maximum rate of convergence is
Po
(18)
=
'iT
2 sin K
where K is the larger of M + I and N + 1 (see above). Even better results can be achieved
by letting P vary from step to step. More details of the ADI method and variants are
discussed in Ref. rE2S] listed in App. 1.
---.•.--.......-........
..
-.-~
--
..
..........
,,-~--.-.
1. Derive (Sb). (6b). and (6c).
12-71
GAUSS ELIMINATION,
GAUSS-SEIDEL ITERATION
For the grid in Fig. 455 compute the potential at the four
internal points by Gauss and by 5 Gauss-Seidel steps with
starting value~ 100. 100, 100, 100 (showing the details of
your work) if the boundary values on the edges are:
2.
3
II = 0 on the left. x on the lower edge, 27 - 9.\'2 on
the righl. x 3 - 27x on the upper edge.
ue.
lIe I, 0) = 60.
0) = 300. II = lOO on the other three
edges.
4. 1/ = X4 on the lower edge. 81 - 54.\'2 + )"4 on the right.
x 4 - 54x 2 + 81 on the upper edge, y4 on the left.
Verity the exact solution X4 - 6x 2y2 + y4 and
determine Ihe elTor.
3.
5.
II
6.
U = no on the upper and lower edges, lID on the left
and right.
= sin !7TX on the upper edge, 0 on the other edges.
10 steps.
SEC 21.5
Neumann and Mixed Problems. Irregular Boundary
7. Vo on the upper and lower edges, - Vo on the left and
right. Sketch the equipotentIal lines.
917
use symmetry; take II = 0 as the boundary value at the
two points at which the potential has a jump.
u = 110 V
"~110V~"~11",
2
y
x
Fig. 455.
2
"~-110Vt.fj "~-110V
3
Problems 2-7
u=-110V
8. Verify the calculations in Example 1. Find out
experimentally how many steps are needed to obtain the
solution of the linear system with an accuracy of 3S.
9. tUse of symmetry) Conclude from the boundary
values in Example I that U21 = Un and U22 = U12'
Show that this leads to a system of two equations and
solve it.
10. (3 X 3 grid) Solve Example I, choosing h = 3 and
starting values 100, 100, ....
11. For the square 0 ~ x ~ 4, 0 ~ y ~ 4 let the boundary
temperatures be O°C on the horizontal and 50°C on the
vertical edges. Find the temperatures at the interior
points of a square grid with h = I.
12. Using the answer to Prob. II, try to sketch some
isotherms.
13. Find the isotherms for the square and grid in Prob. II
if U = sin ~'iTx on the horizontal and -sin ~'iTY on the
vertical edges. Try to sketch some isotherms.
14. (Intluence of starting values) Do Prob. 5 by
Gauss-Seidel, starting from O. Compare and comment.
15. Find the potential in Fig. 456 using (a) the coarse grid,
(b) the fine grid, and Gauss elimination. Hint. In (b),
21.5
Fig. 456.
Region and grids in Problem 15
16. (ADI) Apply the ADI method to the Dirichlet problem
in Prob. 5, using the grid in Fig. 455, as before and
starting values zero.
17. What Po in (18) should we choose for Prob. 16? Apply
the ADI formulas (17) with Po = l. 7 to Prob. 16,
performing I step. lllustrate the improved convergence
by comparing with the corresponding values 0.077,
0.308 after the first step in Prob. 16. (Use the starting
values zero.)
18. CAS PROJECT. Laplace Equation. la) Write a
program for Gauss-Seidel with 16 equations in 16
unknowns, composing the matrix (13) from the
indicated 4 X 4 submatrices and including a
transfOimation ofthe vector ofthe boundary values into
the vector b of Ax = b.
(b) Apply the progranl to the square grid in 0 ~ x ~ 5.
o ~ y ~ 5 with h = I and u = 220 on the upper and
lower edges, U = 110 on the left edge and u = -10
on the right edge. Solve the linear system also by Gauss
elimination. What accuracy is reached in the 20th
Gauss-Seidel step?
Neumann and Mixed Problems.
Irregular Boundary
We continue our discussion of boundary value problems for elliptic PDEs in a region R
in the xy-plane. The Dirichlet problem was studied in the last section. In solving Neumann
and mixed problems (defined in the last section) we are confronted with a new situation,
because there are boundary points at which the (outer) normal derivative Un = au/an of
the solution is given, but u itself is unknown since it is not given. To handle such points
we need a new idea. This idea is the same for Neumann and mixed problems. Hence we
may explain it in connection with one of these two types of problems. We shall do so and
consider a typical example as follows.
918
E X AMP L E 1
CHAP. 21
Numerics for ODEs and PDEs
Mixed Boundary Value Problem for a Poisson Equation
Solve the mixed boundary value problem for the Poisson equation
shown in Fig.
~57a.
1.5
x
u=o
(a)
Regio~
(b) Grid (h = 0.5)
R and boundary values
Fig. 457.
Mixed boundary value problem in Example 1
Solution.
We u~e the grid shown in Fig. 457b. where" = 0.5. We recall that (7) in Sec. 21A has the right
side f (x. y) = 0.5 2 • 12xy = 3xy. From the formulas II = 3,·3 and I/ n = 6x given on the boundary we compute
the bounda.ry data
,,2
(1)
PH
1/31 = 0.375.
and
P21
ilU12
U32 =
3.
ely
= 6·0.5 = 3,
=
6·1 = 6.
are internal mesh points and can be handled as in the last section. Indeed. from (7), Sec. 21.4. with
{,2 = 0.25 and ,,2f (X. y) = 3xy and from the given boundary values we obtain two equations corresponding to
PH and P 21 • as follows (with -0 resulting from the left boundary).
12(O.5·0.5)·! - 0 = 0.75
(2a)
+ 1122
=
12(1 • 0.5)·! - 0.375
=
1.125
The only difficulty with these equations seems to be that they involve the unknown values U12 and 1/22 of 1/ at
P 12 and P22 on the boundary. where the normal derivative I/ n = ol//iln = flu/o)" is given. instead of 1/: but we
shall overcome this difficulty as follows.
We consider P12 and P 22 . The idea that will help us here is thb. We imagine the region R to be extended
above to the first row of external mesh points (corresponding to y = 1.5), and we assume that the Poisson
equation also holds in the extended region. Then we can write down two more equations as before (Fig. 457b)
=
1.5 - 0 = 1.5
(2b)
+ "23
=3
- 3
= O.
On the right. 1.5 is 12xy,,2 at (0.5. I) and 3 is 12x.'"{,2 at (I. 1) and 0 (at P02 ) and 3 (at P32 ) are given boundary
values. We remember that we have not yet used the boundary condition on the upper part of the boundary of
R, and we also notice that in (2b) we have introduced two more unknowns 1/13' 1/23' But we can now use that
comlilion and get rid of "13' 1/23 by applying the central difference formula for dl/ Idy. From (I) we then obtain
(see Fig. 457b)
3=
6=
flll12
oy
01/22
oy
1/13 - 1/11
= 1113 - lIll.
21z
1/23 2h
hence
1/13
hence
1123
=
1111 + 3
u21
= 1I23 - lI21'
=
"21 + 6.
SEC. 21.5
Neumann and Mixed Problems. Irregular Boundary
919
Substituting these results into (2b) and simplifying, we have
2"11
2"21
41112
+
+
1112 -
1122
4U22
= 1.5 - 3 = -1.5
= 3 - 3 - 6 = -6.
Together with (2a) this yields, written in matrix form,
(3)
r
-~ -4 0 ~] r::::] r~:::5] r ~:::5].
=
2
0
o
2
1
1112
-4
1122
-4
1.5 - 3
-1.5
~0-6
-6
(The entnes 2 come from U13 and l/23, and so do -3 and -6 on the right). The solution of (3) (obtained by
Gauss elimination) is as follows; the exact values of the problem are given in parentheses.
U12
= 0.R66
un = 0.077
= I.RI2
(exact I)
U22
lexact 0.125)
"21 =
0.191
(exact 2)
(exact 0.25).
•
Irregular Boundary
We continue our discussion of boundary value problems for elliptic PDEs in a region R
in the xy-plane. If R has a simple geometric shape, we can usually arrange for certain
mesh points to lie on the boundary C of R, and then we can approximate partial derivatives
as explained in the last section. However, if C intersects the grid at points that are not
mesh points, then at points close to the boundary we must proceed differently, as follows.
The mesh point 0 in Fig. 458 is of that kind. For 0 and its neighbors A and P we obtain
from Taylor'S theorem
(a)
UA
=
+
uh
lIo -
h
Uo
(4)
(b)
lip
=
auo
ax
auo
ax
+
+
1
-
2
-
I
2
a2uo
(uhf -'-2-
ax
h2
2
a uo
2
ax
+ ...
+
We disregard the terms marked by dots and eliminate alia lax. Equation (4b) times a plus
equation (4a) gives
p
Q
Fig. 458.
c
Curved boundary C of a region R, a mesh point 0 near C, and neighbors A, B, P, Q
920
CHAP. 21
Numerics for ODEs and PDEs
We solve this last equation algebraically for the derivative, obtaining
;PU o
ax 2
2 [I
=
+ a)
a(l
h2
I
+~
UA
I
Up -
-;; U o
lIQ -
I
-b
]
Similarly, by considering the points O. B. and Q.
a2Uo
2
iJy
2
h
=
-2
[1+
b(\
b)
+ - -I I +b
LIB
Uo
]
.
By addition,
(5)
y 2 U o = -22
[UA
h
For example, if a
a (l
=
+
a)
+
+
liB
b (1
+
Up
1+a
b)
+ ~ _ ta + b)uo ]
I +b
ab
.
!, b = !, instead of the stencil (see Sec. 21.4)
we now have
because 1/[a(1 + a)l = ~, etc. The sum of all five terms still being zero (which is useful
for checking).
Using the same ideas, you may show that in the case of Fig. 459.
2
(6) V U o
2
1z2
= -
[UA +
a(a
+ p)
UB
b(b
+
q)
+
Up
pep
+
a)
+
ap + bq
abpq
uQ
--~-
q(q
+
b)
]
Uo
,
a formula that takes care of all conceivable cases.
B
bh
0
ph
ah
P~--""---~A
qh
Q
Neighboring points A. B. P, Q of a
mesh point 0 and notations in formula (6)
Fig. 459.
E X AMP L E 2
Dirichlet Problem for the Laplace Equation. Curved Boundary
Find the potential II in the region in Fig. 460 that has the boundary values given in that figure: here the curved
portion of the boundary is an arc of the circle of radius 10 about (0, 0). Use the grid in the figure.
3
II is a solution of the Laplace equation. From the given formulas for the boundary values II = x ,
512 - 24y2 . ... we compute the values at [he points where we need them: the result is shown in [he figure.
For Pll and P 12 we have the usual regular' stencil. and for P21 and P 22 we use (6), obtaining
Solution.
1/ =
0.5
(7)
-2.5
0.5
0+
0.9
-3
0.6
0+
SEC. 21.5
Neumann and Mixed Problems. Irregular Boundary
921
y
6
u=512-24/
u=o
u=o
3
u = 296
u=
216
u = 27
°O~----~3~~--~6--~8~~x
3
U =X
Region, boundary values of the
potential, and grid in Example 2
Fig. 460.
We use this and the boundary
obtain the system
value~
and take the mesh points in the
o-
- 4u n +
0.6ll n
+
- 2. 5u21
0.6U21
0.5U22 =
+ 0.61112 -
u~ual
order P lI • P 21 , P 12 • P 22 . Then we
27
=
-27
-0.9' 296 - 0.5 . 216 = -374.4
+0
U22
=
702
31122
=
0.9 . 352
702
+ 0.9 . 936 = 1159.2.
Tn matrix form,
-2.5
(8)
o
0
-4
0.6
0.6
:.5]
I
r:: :] r-~~:.4].
-3
1112
702
1122
1159.2
Gauss elimination yields the (ruundedl value,
Uu = -55.6,
U21 =
49.2,
ll12 =
-298.5,
ll22 =
-436.3.
Clearly, from a grid with so few mesh points we cannot expect great accuracy. The exact solution of the PDE
(not of the difference equation) having the given boundary values is u = x 3 - 3xy2 and yields the values
Ull =
-54,
!l21 =
54,
U12 =
-297,
U22 =
-432.
Tn praclice one would use a much finer grid and solve the resulting large system by an indirect
method.
•
1. Verify the calculation for the Poisson equation in
Example 1. Check the values for (3) at the end.
2. Delive (5) in particular when a = h =
!.
3. Deri ve the general stencil formula (6) in all detail.
4. Verify the calculation for the boundary value problem
in Example 2.
5. Do Example I in the text for "\7 2 u = 0 with grid and
boundary data as before.
MIXED BOUNDARY VALUE PROBLEMS
6. Solve the mixed boundary value problem for the
Laplace equation "\7 2 u = 0 in the rectangle in Fig. 457a
(using the grid in Fig. 457b) and the boundary
conditions u~ = 0 on the left edge, U x = 3 on the right
edge, u = x 2 on the lower edge, and u = x 2 - 1 on
the upper edge.
7. Solve Prob. 6 when un = 1 on the upper edge and
u = 1 on the other edges.
922
CHAP. 21
Numerics for ODEs and PDEs
8. Solve the mixed boundary value problem for the
Poisson equation ,2u = 2(x 2 + y2) in the region and
for the boundary conditions shown in Fig. 461, using
the indicated grid.
y
/u=o
3
u
2
0
Problems 8 and 10
9. CAS EXPERIMENT. Mixed Problem. Do Example
1 in the text with finer and finer grids of your choice
and study the accuracy of the approximate values by
comparing with the exact solution u = 2.\"y3. Verify
the latter.
10. Solve , 211 = -1T2y sin! 1T.\' for the grid in Fig. 461
and lIy(l, 3) = lI y (2, 3) = !vW, II = 0 on the other
three sides of the square.
1/
0
Fig. 462.
11--<>--=--<r""--~
u=o
P21
3
x
Problem 11
12. If in Prob. II the axes are grounded (II = 0), what
constant potential must the other portion of the
boundary have in order to produce 100 volts at Pu?
13. What potential do we have in Prob. II if L/ = 190 volts
on the axes and L/ = 0 on the other portion of the
boundary?
14. Solve the Poisson equation y 2L/ = 2 in the region and
for the boundary values shown in Fig. 463, using the
grid also shown in the figure.
y
3
u=/-3y----.......
1.5
n-<>----I~ u = /
IRREGULAR BOUNDARIES
11. Solve the Laplace equation in the region and for the
boundary values shown in Fig. 462. using the indicated
grid. (The sloping portion of the boundary is
y = 4.5 - x.)
21.6
2
u =3x
u=o---"
Fig. 461.
- 1.5x
~u=9-3y
P ll
2 I----{>--='::.....,:>---='--~
2
P12
u=o---
y
=X
- 1.5y
u=o
Fig. 463.
Problem 14
Methods for Parabolic PDEs
The last two sections concerned elliptic POEs. and we now tum to parabolic POEs. Recall
that the definitions of elliptic, parabolic, and hyperbolic POEs were given in Sec. 21.4.
There it was also mentioned that the general behavior of solutions differs from type to
type, and so do the problems of practical interest. This reflects on numerics as follows.
For all three types, one replaces the POE by a corresponding difference equation, but
for parabolic and hyperbolic POEs this does not automatically guarantee the convergence
of the approximate solution to the exact solution as the mesh h ~ 0; in fact, it does not
even guarantee convergence at all. For these two types of POEs one needs additional
conditions (inequalities) to assure convergence and stability, the latter meaning that ~mall
perturbations in the initial data (or small errors at any time) cause only small changes at
later times.
In this section we explain the numeric solution of the prototype of parabolic PDEs, the
one-dimensional heat equation
(c constant).
SEC. 21.6
923
Methods for Parabolic POEs
This POE is usually considered for x in some fixed interval, say, 0 :;;; x :;;; L, and time
t ~ 0, and one prescribes the initial temperature u(x, 0) = f(x) (f given) and boundary
conditions at x = 0 and x = L for all t ~ 0, for instance 1/(0. t) = 0, I/(L, t) = O. We may
assume c = I and L = I; this can always be accomplished by a linear transformation of
x and t (Prob. I). Then the heat equation and those conditions are
O:;;;x:;;;Lt~O
(1)
(2)
II(X,
(3)
u(O. t)
=
0)
=
(Initial condition)
f(x)
u(], t)
=0
(Boundary conditions).
A simple finite difference approximation of (1) is [see (6a) in Sec. 21.4;j is the number
of the time step]
(4)
]
k
1
(Ui,j+l -
Uij)
= /7 2
(Ui-l,j -
2Uij
+ Ui-l,j)'
Figure 464 shows a corresponding grid and mesh points. The mesh size is h in the
x-direction and k in the t-direction. Formula (4) involves the four points shown in
Fig. 465. On the left in (4) we have used ajonmrd difference quotient since we have no
information for negative t at the start. From (4) we calculate ui.j+l' which corresponds to
time row j + 1, in terms of the three other II that correspond to time row j. Solving (4)
for lIi,j+I' we have
(5)
lIi,j+1 =
(1 -
+
2r)uij
r(ui+l,j
+
Ui-l,j),
u = 0 --....,..,I---o-----<J---o----l (j = 2)
~u=O
I---o-----<J---o----l(j=l)
k h
u
Fig. 464.
!
x
= {(x)
Grid and mesh points corresponding to (4), (5)
(i,j + 1)
X
/k
( i - l . j ) X - - - h - - X - - - h - - X (i + l,j)
(i,j)
Fig. 465.
The four points in (4) and (5)
r=
924
CHAP. 21
Numerics for ODEs and PDEs
Computations by this explicit method based on (S) are simple. However, it can be
shown that crucial to the convergence of this method is the condition
(6)
r=
That is. Uij should have a positive coefficient in (S) or (for r =~) be absent from (S). Intuitively,
(6) means that we should not move too fast in the (-direction. An example is given below.
Crank-Nicolson Method
Condition (6) is a h<mdicap in practice. Indeed, to attain sufficient accuracy, we have to
choose h small, which makes k very small by (6). For example, if h = 0.1, then k ::::; O.OOS.
Accordingly, we should look for a more satisfactory discretization of the heat equation.
A method that imposes no restriction on r = klh 2 is the Crank-Nicolson method,
which uses values of u at the six points in Fig. 466. The idea of the method is the
replacement of the difference quotient on the right side of (4) by ~ times the sum of two
such difference quotients at two time rows (see Fig. 466). Instead of (4) we then have
I
k
1
(Ui,j+l -
Uij) =
2h2 (Ui+l,j
(7)
+
-
2uij
I
2h2 (Ui+l,j+l -
2Ui,j+1
+
Ui-l,j+l)'
Multiplying by 2k and wliting r = klh 2 as before, we collect the terms corresponding to
time row j + I on the left and the terms corresponding to time row j on the right:
(8)
(2
+ 2r)Ui,j+1
-
rtUi+I,j+l
+
lIi-l,j+l)
= (2 -
2r)uij
+
r(ui+l,j
+ Ui-l,j)'
How do we use (8)? In general, the three values on the left are unknown, whereas the
three values on the right are known. If we divide the x-interval 0 ::::; x ::::; I in (I) into
II equal intervals, we have II 1 internal mesh points per time row (see Fig. 464, where
II = 4). Then for j = 0 and i = I,···. II I. formula (8) gives a linear system
of II - I equations for the II - I unknown values Ull, 11 2 1, ••• , U n -I,1 in the first time
row in terms of the initial values lIoo, UlO, . • • , UnO and the boundatl' values Um (= 0).
lInl (= 0). Similarly for j = l,.i = 2, and so on; that is, for each time row we have to
solve such a linear system of II - I equations resulting from (8).
Although r = klh 2 is no longer restricted, smaller r will still give better results. In
practice, one chooses a k by which one can save a considerable amount of work, without
making r too large. For instance, often a good choice is r = 1 (which would be impossible
in the previous method). Then (8) becomes simply
(9)
411i ,j+l -
ui+l,j+l -
Ui-l,j+l
=
Time rowj + 1
X
X
Time rowj
X
X
ui+l,j
+ Ui-l,j'
X
Ik
h
h
Fig. 466. The six points in the CrankNicolson formulas (7) and (8)
X
SEC. 21.6
925
Methods for Parabolic PDEs
j=5
0.20
0.16
.
.. . ..
0.12
P12
j=3
P 22
r-
j=2
----- ----I
Ip
d
----- - - ----- e - - j = 1
I
IplO
Ip40
P20
P30
j=O
0.2
0.4
0.6
0.8
1.0
;=1
;=3
i = 2
i =4
i= 5
0.08
---
IP
0.04
t= 0
x=O
u
Grid in Example 1
Fig. 467.
E X AMP L E 1
j=4
Temperature in a Metal Bar. Crank-Nicolson Method, Explicit Method
Consider a laterally insulated metal bar of length I and such that c 2 = I in the heat equation. Suppose that the
ends of the bar are kept at temperature" = O°C and the temperature in the bar at some instant -call it I = 0i, .f(x) = sin 1T.\' Applying the Crank-Nicolson method with h = 0.2 and r = I, find the temperature u(x. f) in
the bar for 0 ~ t ~ 0.2. Compare the results with the exact solution. Also apply (5) with an r satisfying (6),
say, r = 0.25. and with values nO! satisfying (6). say. ,. = I and,. = 2.5.
Solution by Crank-Nicolson.
Since,. = I. formula (8) takes the form (9). Since h = 0.2 and
klh 2 = I. we have k = h 2 = 0.04. Hence we have to do 5 steps. Figure 467 shows the grid. We shall need
the initial values
,. =
1110
=
sin 0.277
=
0.587785.
"20 = sin 0.41T = 0.951 057.
Also, "30 = "20 and "40 = lI1O' (Recall that lI10 means" at P10 in Fig. 467. etc.) In each time row in
Fig. 467 there are 4 internal mesh points. Hence in each time step we would have to solve 4 equations in 4
unknowns. But since the initial temperature distribution is symmetric with respect to x = 0.5, and II = 0 al
both ends for all t. we have lI31 = lI21, "41 = lIll in the first time row and similarly for the other rows. This
reduces each system to 2 equations in 2 unknowns. By (9). since 1/31 = 1/21 and 1101 = O. for j = 0 these
equations are
(i = I)
=
1100 + "20 = 0.951 057
(i = 2)
The solution is lin = 0.399274.
lI21 =
(i = I)
(i = 2)
The solution is
(Fig. 468):
I
U12
t
x=O
0.00
0.04
0.08
0.12
0.16
0.20
0
0
0
0
0
0
0.646039. Similarly, for time row j = I we have the system
4"12 -£l12
+
1122 = "01 +
£l21 =
0.646039
+
U21 =
1.045313.
31/22 =
"11
0.271 221. 1/22 = 0.438844, and so on. This gives the temperature distribution
X
= u.2
0.588
0.399
0.271
0.184
0.125
0.085
x = U.4
x = U.6
x = u.8
0.951
0.646
0.439
0.298
0.202
0.138
0.951
0.646
0.439
0.298
0.202
0.138
0.588
0.399
0.271
0.184
0.125
0.085
x=
0
0
0
0
0
0
926
CHAP. 21
Numerics for ODEs and PDEs
u(x. t)
1
Fig. 468.
Temperature distribution in the bar in Example 1
Comparison witll the exact solutio1l.
The present problem can be solved exactly by separating
variables (Sec. 12.5): the result is
(10)
= 0.25.
For h = 0.2 and r = klh 2 = 0.25 we have
k = rh = 0.25 . 0.04 = 0.0 I. Hence we have to perform 4 times as many step' as with the Crank-Nicolson
method! Formula (5) with,. = 0.25 is
Solution by the explicit method (5) with r
2
(11)
We can again make use of the symmetry. For j
"20 = "30 = 0.951 057 and compute
= 0 we need
"00
= O.
1110
= 0.5877!lS (see p. 925).
lin = 0.25(1100 + 21110 + "20) = 0.531 657
U21
= 0.25 (1110 + 2"20 +
Of cou[,e we can omit the boundary terms
compute
1130)
O.
1101 =
1112 = 0.25(21111
1122
= 0.25 (1110 +
+
"02 =
31120)
=
O.R(iO 239.
0.··· from the formulas. For j
I we
1121) = 0.480888
= 0.25(1111 + 31121) = 0.778094
and so on. We have to perform 20 steps in,tead of the S CN steps. but the numeric values ,how that the accuracy
is only about the same as that of the Crank-Nicolson values CN. ll1e exact 3D-values follow from (10).
x
=
0.2
x = 0.4
t
0.04
0.08
0.12
0.16
0.20
CN
By (11)
Exac[
CN
By (11)
Exac[
0.399
0.271
0.184
0.125
0.085
0.393
0.263
0.176
0.118
0.079
0.396
0.267
0.180
0.121
0.646
0.439
0.298
0.202
0.637
0.426
0.285
0.191
0.641
0.432
0.291
0.196
0.082
0.138
0.128
0.132
Failure of (5) with r violati1lg (6).
Formula (5) with h = 0.2 and r = I-which violates (6)-is
SEC. 21.6
Methods for Parabolic PDEs
and
give~
927
very poor values: ,orne of the,e are
T =
0.04
0.12
0.20
FOlmula (5) with an even larger r
these are
0.1
0.3
0.363
0.139
0.053
= (I -
2r)uoj
+
Exact
T
0.396
0.180
0.082
=
0.4
0.588
0.225
0.086
Exact
0.641
0.291
0.132
= 2.5 (and h = 0.2 as before) gives completely nonsensical results: some of
x = 0.2
Exact
0.1)265
0.0001
0.2191
0.0304
1. (Nondimensional form) Show that the heat equation
ut = c 2 uj'j', 0 ~ X ~ L, can be transformed to the
"nondimensionar' standard fonn lit = l(ox. 0 ~ x ~ I.
by setting x = .WL. t = c 271L2. II = uluo, where lto is
any constant temperature.
2. Derive the difference approximation (4) of the heat
equation.
3. Derive (5) from (4).
4. Using the explicit method [(5) with II = I and k = 0.5].
find the temperature at t = 2 in a laterally insulated
bar of length 10 with ends kept at temperature 0 and
initial temperature fIx) = x - 0.lx 2 •
5. Solve the heat problem (I )-(3) by Crank-Nicolson
for 0 ~ t ~ 0.20 with II = 0.2 and k = 0.04 when
fIx) = x if 0 ~ x < !. lex) = I - x if ! ~ x ~ I.
Compare with the exact values for t = 0.20 obtained
from the series (2 terms) in Sec. 12.5.
6. Solve Prob. 5 by the explicit method with h = 0.2 and
k = 0.01. Do II steps. Compare the last values with the
Crank-Nicolson 3S-values 0.107. 0.175 and the exact
3S-values 0.108, 0.175.
7. The accuracy of the explicit method depends on
r (~!). Illustrate this for Prob. 6. choosing r = ~ (and
h = 0.2 as before). Do 4 steps. Compare the values for
t = 0.04 and 0.08 with the 3S-values in Prob. 6, which
are 0.156. 0.254 (t = 0.04).0.105,0.170 (t = 0.08).
8. If the left end of a laterally insulated bar extending
from x = 0 to x = I is insulated. the boundary condition
at x = 0 is 11,,(0. t) = u.,.(O. t) = O. Show that in the
application of the explicit method given by (5), we can
compute IIO,j+ I by the fomlllla
1I0. j + I
0.2
2ruij'
Apply this with II = 0.2 and r = 0.25 to determine the
temperature II(X. t) in a laterally insulated bar extending
fi'om x = 0 to I if lI(X, 0) = O. the left end is insulated
9.
10.
11.
12.
x
=
0.4
0.0429
0.0001
Exact
0.3545
0.0492.
•
and the right end is kept at temperature get) = sin ¥'lTt.
Hint. Use 0 = Buo/ilx = (IIIj - IL1.) I2h.
In a laterally insulated bar of length I let the initial
temperature be fIx) = x if 0 ~ x ~ 0.2,
fIx) = O.lS( I - x) if 0.2 ~ x ~ I. Let u(O, t) = O.
11(1. t) = 0 for all t. Apply the explicit method with
II = 0.2, k = 0.01. Do 5 steps.
Solve Prob. 9 for f(x) = x if 0 ~ x ~ 0.5,
I(x) = I - x if 0.5 ~ x ~ l, all the other data being
as before. Can you expect the solution to satisfy
lI(x. t) = 11(1 - x, t) for all t?
Solve Prob. 9 by (9) with II = 0.2, 2 steps. Compare
with exact values obtained from the series in Sec. 12.5
(2 tenns) with suitable coefficients.
CAS EXPERIMENT. Comparison of Methods.
(a) Write programs for the explicit and the
Crank-Nicolson methods.
(b) Apply the programs to the heat problem of a
laterally insulated bar of length I with lI(x, 0) = sin 'lTX
and u(O, t) = 11(1. t) = 0 for all t, using h = 0.2,
k = 0.01 for the explicit method (20 steps), II = 0.2
and (9) for the Crank-Nicolson method (5 steps). Obtain
exact 6D-values from a suitable series and compare.
(e) Graph temperature curves in (b) in two figures
similar to Fig. 296 in Sec. 12.6.
(d) Expeliment with smaller h (0.1,0.05. etc.) for both
methods to find out to what extent accuracy increases
under systematic changes of II and k.
113-151
CRANK-NICOLSON
Solve (I )-(3) b) Crank-Nicolson with r = 1 (5 steps).
where:
13. fIx) = X( I - x), lz = 0.2
14. f(x) = x( I - x), h = O. I (Compare with Prob. 13.)
15. f(x) = 5x if 0 ~ x < 0.2. f(x) = 1.25(1 - x) if
0.2 ~ x ~ 1, h = 0.2
928
21.7
CHAP. 21
Numerics for ODEs and PDEs
Method for Hyperbolic PDEs
In thi~ section we consider the numeric solution of problems involving hyperbolic POEs.
We explain a standard method in terms of a typical setting for the prototype of a hyperbolic
POE. the wave equation:
=
o~ x
~ 1, 1 ~
0
(1)
U tt
(2)
u(x. 0)
=
j(x)
(Given initial displacement)
(3)
ut(x. 0)
=
g(x)
(Given initial velocity)
(4)
u(O, t)
=
U xx
u(l, t)
=0
(Boundary conditions).
Note that an equation U tt = c 2 u-r;x and another x-interval can be reduced to the form (I)
by a linear transformation of x and 1. This is similar to Sec. 21.6, Prob. I.
For instance, (1)-(4) is the model of a vibrating elastic string with fixed ~nds at
x = 0 and x = I (see Sec. 12.2). Although an analytic solution of the problem is given
in (13). Sec. 12.4, we lise the problem for explaining basic ideas of the numeric approach
that are also relevant for more complicated hyperbolic PDEs.
Replacing the derivatives by difference quotients as before, we obtain from (1) [see (6)
in Sec. 21.4 with y = t]
(5)
I
k2
(Ui,j+l -
2Uij
+
Ui,j-I)
=
I
h2
(Ui+l,j -
2Uij
+
Ui-I,j)
where h is the mesh size in x, and k is the mesh size in t. This difference equation relates
5 points as shown in Fig. 469a. It suggests a rectangular grid similar to the grids for
parabolic equations in the preceding section. We choose r* = k 2flz2 = L Then Uij drops
out and we have
(6)
(Fig. 469b).
It can be shown that for 0 < r* ~ 1 the present explicit method is stable, so that from
(6) we may expect reasonable results for initial data that have no discontinuities. (For a
hyperbolic POE the latter would propagate into the solution domain-a phenomenon that
would be difficult to deal with on our present grid. For unconditionally stable implicit
methods see [El] in App. L)
Equation (6) still involves 3 time steps j - I,j,j + J, whereas the formulas in the
parabolic case involved only 2 time steps. Furthermore, we now have 2 initial conditions.
x
Time rowj + 1
•
Time rowj
x-,-x
Ik
x--x--x
h
Ik h
x
(a) Formula (5)
Fig. 469.
Time rowj-l
I
X
(bl Formula (6)
Mesh points used in (5) and (6)
SEC. 21.7
Method for Hyperbolic PDEs
929
So we ask how we get stal1ed and how we can use the initial condition (3) This can be
done as follows.
From lItex. 0) = g(x) we derive the difference formula
I
2k
(7)
=
where &
(uil -
g(ih). For t
Into this we substitute
and by simplification
Ui,-l)
hence
= gi.
= O. that is, j = 0, equation (6) is
Ui.-l
as given in (7). We obtain
Ua
=
Ui-l,O
+
Ui+l,O -
+
Uil
2kgi
(8)
This expresses
E X AMP L E 1
lta
in terms of the initial data. It is for the beginning only. Then use (6).
Vibrating String, Wave Equation
Apply the present method "ith" = k = 0.1 to the problem (1)-(4). where
i(x) = sin
g(x) = O.
77X.
Solutio1l. The grid is the same as in Fig. 467, Sec. 21.6, except for the values of t, which now are 0.2.
004, ... (instead of 0.04. 0.08, ... ). The initial values
From (8) and gtx) = 0 we have
liDO, lIlO' . . •
From this we compute. using
0.587785.1120 = lI30 =
and
ltOI =
0.951 057,
1111 =
~(1I00
+
1120) =
~. 0.951 057 = 0.475528
2)
"21 =
~(1I1O
+
"30) =
~. 1.538841 = 0.769421
"31 = 1121,1141 = 1111
using
sin 0.277 =
(i = I)
(i =
and
lI10 = 1140 =
are the same as in Example 1, Sec. 21.6.
1102 = . . . =
by symmetry as in Sec. 21.6, Example I. From (6) withj = I we now compute,
0,
(i = I)
U12 = 1101
+
lI21 -
£lID =
0.76') 421 - 0.587785
(i = 2)
U22 = "11
+
£131 -
1I20 =
0.475528
+ 0.769421 - 0.951057
=
U.UH 636
=
0.293 !l92,
by symmetry; and so on. We thus obtain the following values of the dIsplacement
over the first half-cycle:
"32 = 1I22, "42 = lI12
II(X. t)
of the
~tring
x=O
0.0
0.2
0.4
0.6
0.8
1.0
0
0
0
0
0
0
x
= 0.2
x = 0.4
X
0.588
0.476
0.182
-0.182
-0.476
-0.588
0.951
0.769
0.294
-0.294
-0.769
-0.951
0.951
0.588
0
0.769
0.294
-0.294
-0.769
-0.951
0.476
0.182
-0.182
-0.476
-0.588
0
0
0
0
0
X
= 0.6
=
0.8
x=
CHAP. 21
930
The~e
Numerics for ODEs and POEs
values are exact to 3D (3 decimals), the exact solution of the problem being (see Sec. 12.3)
II(X,
I) = sin
TlX
cos
TIl.
The reason for the exactness follows from d'Alemberfs solution (4), Sec. 12.4. (See Prob. -I, below.)
•
This is the end of Chap. 21 on numerics for ODEs and POEs, a rapidly developing field
of basic applications and interesting research. in which large-scale and complicated
practical problems can now be attacked and solved by the computer.
,@
VIBRATING STRING
Solve (I )-(4) by the present method with h = k = 0.2 for
the given initial deflection fix) and initial velocity 0 on the
given (-interval.
~ ( ~ 2
I - x). 0 ~ ( ~ I
O.OlxO - x), 0
1. f(x)
=
2. f(x)
3. f(x)
= x2(
if 0
0.2 < x ~ I
= x
~
x
~
0.2. f(x)
0.25(1 - x) if
4. Show that from d' Alemberfs solution (13) in Sec. 11.4
with c = I it follows that (6) in the present section
gives the exact value 1I;,i+l = lI(ih, (j + l)ft).
5. If the string governed by the wave equation (I) starts
from its equilibrium position with initial velocity
g(x) = sin 'In. what is its displacement at time ( = 0.4
and x = 0.2. 0.4. 0.6. 0.8? (Use the present method
with h = 0.2. k = 0.2. Use (8). Compare with the exact
values obtained from (I2) in Sec. 12.4.)
.................
...
.......
_. __ .... -.. _ ....
......... __"a.-.-.~
1. Explain the Euler and Improved Euler methods in
geometrical terms.
2. What are the local and global orders of a method? Give
examples.
3. What do you know about error estimates? Why are they
important?
4. How did we obtain numeric mcthods by using the
Taylor series'?
5. In each Runge-Kutta step we computed auxiliary
values. How many? Why?
6. What are one-step and multistep methods? Give
examples.
7. What is the idea of a predictor--corrector method?
Mention some of these methods.
6. Compute approximate values in Prob. 5. using a finer
grid (ft = 0.1. k = 0.1), and notice the increase in
accuracy.
7. Illustrate the starting procedure when both f and g
are not identically zero. say, f(x) = I - cos 2'lTx,
g(X) = x - ,\'2. Choose ft = k = 0.1 and do 2 time steps.
8. Show that (I2) in Sec. 12.4 gives as another starting
formula
lin
I
= ~
(Ui+l.O
+
I
lIi-1.0)
+ ?
-
-
I
k
Xi
+
g(s) ds
Xi- k
(where one can evaluate the integral numerically if
necessary). In what case is this identical with (8)?
0.1,
9. Compute u in Prob. 7 for r = 0.1 and x
0.2, .. " 0.9, using the formula in Prob. 8. and
compare the values.
10. Solve (I)-(3) (h = k = 0.2, 5 time steps) subject to
f(x) = .\.2. g(x) = 2x. 11,.(0. t) = 2t. lIe I. t) = (l + 1)2.
STIONS AND PROBLEMS
10. What is automatic step size control? How is it done in
practice?
11. Why and how did we use finite differences in this
chapter,?
12. Make a list of types of PDEs. corresponding problems,
and methods for their numeric solution.
13. How did we approximate the Laplace equation? The
Poisson equation?
14. Will a difference equation give exact solutions of a PDE?
15. How did we handle (a) irregularly shaped domains.
(b) given normal derivatives at the boundary?
16. Solve y' = 2x.\', yeO) = I, by the Euler method with
ft = 0.1. 10 steps. Compute the en·or.
B- What is the idea of the Rungc-Kutta-Fehlberg method?
17. Solve y' = 1 + y2. yCO) = O. by the improved Euler
method with h = 0.1,5 steps. Compute the error.
9. How can Runge-Kutta be generalized to systems of
ODEs?
18. Solve y' = (x + y h = 0.2, 7 steps.
4)2, yeO)
= 4. by RK with
931
Chapter 21 Review Questions and Problems
19. Solve Prob. 17 by RK with II = 0.1, 5 steps. Compute
the error. Compare with Prob. 17.
20. (Fair comparison) Solve y' = 2x- i Vy - In x + X-I,
y(l) = 0 for I ~ x ~ 1.8 (a) by the Euler method with
h = 0.1, (b) by the improved Euler method with
h = 0.2. (c) by RK with h = 0.4. Verify that the exact
solution is y = (In X)2 + In x. Compute and compare
the errors. Why is the comparison fair?
21. Compute eX for x = 0, 0.1, .... 1.0 by applying RK
to y' = r. yeO) = I. h = 0.1. Show that the result is
5D-exact.
I~O-321
POTENTIAL
Find the potential in Fig. 471, using the given grid and the
boundary values:
30. u = 70 on the upper and left sides, U = 0 on the lower
and right sides
31. u{PlO) = u(P 30 ) = 960. u(P20 ) = -480, U = 0
elsewhere on the boundary
32. u(P Ol )
u(P oa )
u{p 41) = u(P 43) = 200,
u(P lO ) = u(P 30 )
400, u(P 20 ) = 1600,
u(P 02 ) = U(P 42 )
U(P 34 ) = 0
U(P I4 )
=
lI(P 24 ) =
22. Solve y' = (x + y)2, yeO) = 0 by RK with h = 0.2.
5 steps.
23. Show thaI by applying the method in Sec. 21.2 to a
polynomial of first degree we obtain the multistep
predictor and corrector formulas
)'~~+l
= Yn
+
Yn.1 = )'n
+
h (3fn -
2
fn-I)
Fig. 471.
where.f~ 11
=
f(Xn+l~ Y:+1)'
24. Apply the multistep method in Prob. 23 to the initial
value problem y' = x + y. yeO) = 0, choosing h = 0.2
and doing 5 steps. Compare with the exact values.
25. Solve y' = (y - x - 1)2 + 2. y(O) = 1 for 0 ~ x ~ I
by Adam~-Moulton with h = 0.1 <md starting values I.
1.200334589. 1.402709878. 1.609336039.
26. Solve y" + Y = O. y(O) = O. y' (0) = I by RKN with
h = 0.2, 5 steps. Find the elTOr.
27. Solve y~ = -4."1 + 3.\'2' y~ = 5)'1 - 6Y2' )'1(0) = 3.
."2(0) = -5. by RK for systems. h = O.L 5 steps.
28. Solve y~ = -5.\"1 + 3)'2')'~ = -3.\"1 - 5.\"2' .\"1(0) = 2.
Y2(0) = 2 by RK for systems. h = 0.1. 5 steps.
29. Find rough approximate values of the electrostatic
potential at Pu. P 12 , P 13 in Fig. 470 that lie in a field
between conducting plates (in Fig. 470 appearing as
sides of a rectangle) kept at potentials 0 and I IO volts
as shown. (Use the indicated grid.)
YI
4
u = 110 V
~_"""I"",1_.
'P13
2
u=o
u=o-
u=o
Fig. 470.
Problem 29
Problems 30-32
33. Verify (13) in Sec. 21.4 for the sy~tem (12) and show
that A in (\ 2) is nonsingular.
34. Derive the difference approximation of the heat equation.
35. Solve the heat equation (l). Sec. 21.6. for the initial
condition J(x) = x if 0 ~ x ~ 0.2, J(x) = 0.25(1 - x)
if 0.2 < x ~ I and boundary condition (3). Sec. 21.6.
by the explicit method [formula (5) in Sec. 21.6] with
h = 0.2 and k = 0.0 I so that you get values of the
temperature at time t = 0.05 as the answer.
36. A laterally insulated homogeneous bar with ends at
x = 0 and x = I has irlitial temperature O. Its left end
is kept at O. whereas the temperature at the right end
varies sinusoidally according to
u(t, I) = get) = sin ~1T1.
Find the temperature u(x. t) in the bar [solution of (1)
in Sec. 21.6] by the explicit method with h = 0.2 and
r = 0.5 (one period, that is, 0 ~ t ~ 0.24).
37. Find lI(X, 0.12) and lI(X, 0.24) in Prob. 36 if the left end
of the bar is kept at -g(t) (instead of 0). all the other
data being as before.
38. Find out how the results of Prob. 36 can be used for
obtaining the results in Prob. 37. Use the values 0.054,
0.172, 0.325, 0.406 (r = 0.12, x = 0.2,0.4.0.6, 0.8) and
-0.009, -0.086. -0.252. -0.353 (r = 0.24) from the
answer to Prob. 36 to check your answer to Prob. 37.
39. Solve lit = lIxx (0 ~ X ~ I. r ~ 0),
2
lI(X. 0) = x ( I - x), u(O. t) = lI( I, r) = 0 by
Crank-Nicolson with h = 0.2. k = 0.04. 5 time steps.
40. Find the solution of the vibrating string problem lit{ = lIxx•
U(x. 0) = x(l - x), lit = 0, u(O, t) = lI( I. t) = 0 by the
method in Sec. 21.7 with h = 0.1 and k = 0.1 for t = 0.3.
932
CHAP. 21
Numerics for ODEs and PDEs
Numerics for ODEs and PDEs
fn this chapter we discussed numerics for ODEs (Secs. 21.1-21.3) and PDEs
(Secs. 21.4-21.7). Methods for initial value problems
v' = f(x, y),
(I)
Y(Xo)
=
Yo
involving a first-order ODE are obtained by truncating the Taylor series
y(x
+ 11)
= y(x)
+ hr
,
(x)
/7 2
+2
/I
y (x)
+ ...
where, by (I), y' = f, y" = f' = aflilx + (ilflay»)"'. etc. Truncating after the term
Izy', we get the Euler method, in which we compute step by step
(2)
J'n+ 1
= Yn + hf(x", Yn)
(11
= 0, I, ... ).
Taking one more term into account, we obtain the improved Euler method. Both
methods show the basic idea but are too inaccurate in most cases.
Truncating after the term in /7 4 , we get the important classical Runge-Kutta (RK)
method of fourth order. The crucial idea in this method is the replacement of the
cumbersome evaluation of derivatives by the evaluation of f(x. y) at suitable points
(x, y); thus in each step we first compute four auxiliary quantities (Sec. 21.1)
kl
= hf(xno Yn)
k2
= hf(xn + 41z. )'n + 4k1 )
k3
= hf(xn + 4h. Yn + 4k2)
(3a)
and then the new value
(3b)
Error and step size control are possible by step halving or by RKF
(Runge-Kutta-Fehlberg).
The methods in Sec. 21.1 are one-step methods since they get )"n+l from the
result Yn of a single step. A multistep method (Sec. 21.2) uses the values of
Ym Yn-l' ... of severa) steps for computing Yn+l. Integrating cubic interpolation
polynomials gives the Adams-Bashforth predictor (Sec. 21.2)
(4a)
= Yn
+
I
24 h(55fn - 59fn-l
+
37fn-2 - 9fn-3)
933
Summary of Chapter 21
where
(4b)
h
f(Xj, }J)' and an Adams-Moulton corrector (the actual new value)
Yn+I = Yn
+
1
24 h(9f~+1
+
19fn - 5fn-I
+
fn-2)'
where f~+1 = f(Xn+h ."~+I)· Here, to get started, .1'1, ."2' Y3 must be computed by
the Runge-Kutta method or by some other accurate method.
Section 19.3 concerned the extension of Euler and RK methods to systems
y' = f(x, y),
j
thus
= 1, ... , 111.
This includes single mth order ODEs, which are reduced to systems. Second-order
equations can also be solved by RKN (Runge-Kutta-Nystrom) methods. These are
particularly advantageous for .v" = f(x • .1') with f not containing y'.
Numeric methods for PDEs are obtained by replacing pattial derivatives by
difference quotients. This leads to approximating difference equations, for the
Laplace equation to
(Sec. 21.4)
(5)
for the heat equation to
(6)
1
k
I
(Ui,j+I - Uij)
= h2
(Ui+I,j - 2Uij
+ Ui-I,j)
(Sec. 21.6)
and for the wave equation to
(7)
(Sec. 21.7);
here hand k are the mesh sizes of a grid in the x- and y-directions, respectively.
where in (6) and (7) the variable .1' is time t.
These PDEs are elliptic, parabolic, and h}perbolic, respectively. Con'esponding
numeric methods differ. for the following reason. For elliptic PDEs we have
boundary value problems, and we discussed for them the Gauss-Seidel method
(also known as Liebmann's method) and the ADI method (Sees. 21.4, 21.5). For
parabolic PDEs we are given one initial condition and boundary conditions, and we
discussed an nplicit method and the Crank-Nicolson method (Sec. 21.6). For
hyperbolic PDEs, the problems are similar but we are given a second initial condition
(Sec. 21.7).
..
~\'f
.
J
PA RT
.&
,
...
F
,,
-
",
'1
- --
I
-
IIii
•
)
-
-'
•
~
Optimization,
Graphs
C HAP T E R 22
Unconstrained Optimization. Linear Programming
C HAP T E R 23
Graphs. Combinatorial Optimization
Ideas of optimization and application of graphs play an increasing role in engineering.
computer science, systems theory. economics. and Olher areas. In the first chapter of this
part we explain some basic concepts. methods. and results in unconstrained and constrained
optimization. The second chapter is devoted to graphs and the corresponding so-called
combinatorial optimization, a relatively new interesting area of ongoing applied and
theoretical research.
935
~\-:'
...:
CHAPTER
, . Ii.
22
-.IIt:!!!
, .:i!fi3!!iI
....
- _. . .
Jill"' 1
Unconstrained Optimization.
Linear Programming
Optimization principles are of basic importance in modern engineering design and systems
operation in various areas. The recent development has been influenced by computers
capable of solving large-scale problems and by the creation of conesponding new
optimization techniques. so that the entire field has become a large area of its own.
In the present chapter we give an introduction to the more impOltant concepts, methods,
and results on unconstrained optimization (the so-called gradient method) and constrained
optimization (linear programming).
Prerequisite: a modest working knowledge of linear systems of equations
References and Answers to Problems: App. 1 Part F, App. 2.
22.1
Basic Concepts.
Unconstrained Optimization
In an optimization problem the objective is to opti11li~e (l11axil1li~e or 1I1ini111i~e) some
function f. This function f is called the objective function.
For example, an objective function f to be maximized may be the revenue in a production
of TV sets, the yield per minute in a chemical process, the mileage per gallon of a certain
type of car, the hourly number of customers served in a bank, the hardness of steel, or
the tensile strength of a rope.
Similarly, we may want to minimize f if f is the cost per unit of producing certain
cameras, the operating cost of some power plant, the daily loss of heat in a heating system.
the idling time of some lathe. or the time needed to produce a fender.
In most optimization problems the objective function f depends on several variables
These are called control variables because we can "control" them, that is. choose their values.
For example, the yield of a chemical process may depend on pressure Xl and temperature
x 2 • The efficiency of a cel1ain air-conditioning system may depend on temperature XI' air
pressure X2' moisture content X3, cross-sectional area of outlet X4. and so on.
Optimization theory develops methods for optimal choices of XI • ••• , x'" which
maximize (or minimize) (he objective function f, that is, methods for finding optimal
values of Xl • • • • , X n .
936
SEC. 22.1
937
Basic Concepts. Unconstrained Optimization
In many problems the choice of values of Xl, . . . , Xn is not entirely free but is subject
to some constraints, that is. additional restrictions arising from the nature of the problem
and the variables.
For example, if Xl is production cost, then Xl ~ 0, and there are many other variables
(time. weight, distance traveled by a salesman, etc.) that can take nonnegative values only.
Constraints can also have the form of equations (instead of inequalities).
We first consider unconstrained optimization in the case of a function I(x}> ... , X,J.
We also wlite x = (Xl, . . . , xn) and I<x), for convenience.
By definition,
I has a minimum at a point x = Xo in a region
R (where
I is defined)
if
I(x)
for all x in R. Similarly,
~
I(X o)
I has a maximum at Xo in R if
for all x in R. Minima and maxima together are called extrema.
Furthermore, I is said to have a local minimum at Xo if
for all x in a neighborhood of Xo, say, for all x satisfying
where Xo = (Xl, ... , Xn) and r > 0 is sufficiently small.
Similarly, I has a local maximum at Xo if I(x) ~ I(X o ) for all x satisfying
Ix - xol < r.
If I is differentiable and has an extremum at a point Xo in the interior of a region R
(that is, not on the boundary), then the partial derivatives ai/axI' ... , aflax., must be
zero at Xo. These are the components of a vector that is called the gradient of I and
denoted by grad lor VI. (For 11 = 3 this agrees with Sec. 9.7.) Thus
(1)
A point Xo at which (1) holds is called a stationary point of I.
Condition (1) is necessary for an extremum of I at Xo in the interior of R, but is not
sufficient. Indeed, if 11 = 1. then for -" = Ilx), condition ll) is y' = !' (Xo) = 0; and, for
instance, )' = x 3 satisfies y' = 3x2 = 0 at X = Xo = 0 where I has no extremum but a
point of inflection. Similarly, for I(x) = XIX2 we have VI(O) = 0, and I does not have
an extremum but has a saddle point at O. Hence after solving (I), one must still find out
whether one has obtained an extremum. In the case 11 = I the conditions y' (Xo) = 0,
-,,"(Xo) > 0 guarantee a local minimum at Xo and the conditions y' (xo) = 0, -,,"(Xo) < 0
a local maximum, as is known from calculus. For 11 > 1 there exist similar criteria.
However. in practice even solving (1) will often be difficult. For this reason, one generally
prefers solution by iteration, that is, by a search process that starts at some point and
moves stepwise to points at which I is smaller (if a minimum of I is wanted) or larger
(in the case of a maximum).
938
CHAP. 22
Unconstrained Optimization. Linear Programming
The method of steepest descent or gradient method is of this type. We present it here
in its standard form. (For refinements see Ref. [E25] listed in App. 1.)
The idea of this method is to find a minimum of f(x) by repeatedly computing minima
of a function get) of a single vruiable t, as follows. Suppose that .f has a minimum at Xo
and we strut at a point x. Then we look for a minimum of f closest to x along the straight
line in the direction of - vf(x). which is the direction of steepest descent (= direction of
maximum decrease) of fat x. That is, we determine the value of t and the corresponding
point
(2)
z(t)
=
x -
tV f(x)
at which the function
g(t)
(3)
= f(z(t»
has a minimum. We take this z(t) as our next approximation
E X AMP L E 1
to
Xo'
Method of Steepest Descent
Determinc a minimum of
(4)
starting ti'01TI Xo
= (6. 3) = 6i + 3j and applying the method of steepest descent.
Solutioll.
Clearly. inspection shows that j(x) has a minimum at tI. Knowing the solution gives u~ a better
feel of how the method worh. We obtain,fIx) = 2Xl i + 6x2j and from this
z(t) = x - t'f(x) = (\ - 2tlx1 i
g(t) =
f(zv»
+ (I - 6rlx2 j
2
2 2
= (1 - 2t) x]2 + 3(1 - 6t) X2 .
We now calculate the derivative
set g' (t) = O. and solve for t. finding
Starting from Xo = 6i + 3j, we compute the value~ in Table 22.1. which are shown in Fig. 472.
Figure 472 suggests that in the case of slimmer ellipses C'a long narrow valley"). convergence would be poor.
You may confirm this by replacing the coefficient 3 in (4) with a large coefficieIlt. For more sophisticated
descent and other methods. some of them also applicable to vector functions of vector variables. we refer to the
references lIsted in Part F of App. 1: see also [E25j.
•
x2
Fig. 472.
Method of steepest descent in Example 1
Linear Programming
SEC. 22.2
939
Table 22.1
Method of Steepest Descent, Computations in Example 1
X
II
0
6.000
3.484
1.327
0.771
0.294
0.170
0.065
I
2
3
4
5
6
3.000
-0.774
0.664
-0.171
0.147
-0.038
0.032
0.210
0.310
0.210
0.310
0.210
0.310
STEEPEST DESCENT
Do 3 steepest descent steps when:
3. f(x) = 3X I 2 + 2X22 - 12xl
S. f(X) =
Xo =
6.
+ 16x2, Xo = [I l]T
+ 2x/ - Xl - 6x2 , Xo = [0 OJT
0.5X12 + 0.7X22 - Xl + 4.2x2 + I,
X1
[-I
f(x) =
.\"0 = [2
2
I]T
x/ +
0.lx22
+
8Xl
+
X2
+ 22.5.
-I]T
7. f(x) = 0.2x/ + x/ - 0.08Xl' Xo = [4 4]T
8. fIx) = X1 2 - X22. Xo = [2 I ]T. 5 steps. First guess.
Then compute. Sketch your path.
22.2
0.581
0.381
0.581
0.38]
0.581
0.381
-0.258
-0.S57
-0.258
-0.857
-0.258
-0.857
10. fIx) = Xl 2 - X2' Xo = [I l]T. Sketch your path.
Predict the outcome of further steps.
11. .f(x) = aXI + bX2' any "0. First guess. then compute.
,3-111
=
I - 6t
9. f(x) = X1 2 + cx/. Xo = [c I ]T. Show that 2 steps
give [c llT times a factor. -4c2 /(c 2 - 1)2. What
can you conclude from this about the speed of
convergence?
1. What happens if you apply the method of steepesl
descent to fIx) = X1 2 + X2 2 ?
2. Verify that in Example I. successive gradients are
orthogonal. What is the reason?
4. f(x)
I - 2t
12. CAS EXPERIMENT. Steepest Descent. (a) Write a
program for the method.
(b) Apply your program to f(x) = X1 2 + 4.\"22.
experimenting with respect to speed of convergence
depending on the choice of Xo .
(c) Apply your program to fIx) = .\"12 + X24 and to
fIx) = X14 + X24, Xo = [2 I]T. Graph level curves
and your path of descent. <Try to include graphing
directly in your program.)
Linear Programming
Linear programming or linear optimization consists of methods for solving optimization
problems with constrai1Zts, that is, methods for finding a maximum (or a minimum)
x = [Xl' ••• , xnl of a li1Zear objective function
satisfying the constraints. The latter are linear inequalities, such as 3x1 + 4X2 :;; 36, or
0, etc. (examples below). Problems of this kind arise frequently, almost daily, for
instance, in production, inventory management, bond trading, operation of power plants.
routing delivery vehicles, airplane scheduling, and so on. Progress in computer technology
has made it possible to solve programming problems involving hundreds or thousands or
more variables. Let us explain the setting of a linear programming problem and the idea
of a "geometric" solution, so that we shall see what is going on.
Xl ~
940
EXAMPLE 1
CHAP. 22
Unconstrained Optimization. Linear Programming
Production Plan
Energy Savers. Inc .. produces heaters of types Sand L. The wholesale price is $40 per heater for Sand $88 for
L. Two time constraints result from the use of two machines M1 and M 2. On M1 one needs 2 min for an S heater
and g min for an L heater. On M2 one needs S min for an S heater and 2 min for an L heater. Determine
production figures Xl and X2 for Sand L. respectively (number of heaters produced per hour) so that the hourly
revenue
~ =
fer)
= 40X1
+ 88x2
is maximum.
Solution.
Production figures Xl and X2 must be nonnegative. Hence the objective function (to be maximized)
and the four constraints are
(0)
(1)
2\~1
+ 8x2
~
60 min time on machine
(2)
SX1
+ 2x2
~
60 min time on machine M2
~
(3)
Ml
0
(4)
Figure 473 shows (0)--(4) as follows.
Con~tancy
lines
Z = COllst
are marked (0). These are lines of constant reHnue. Their slope is -40/88 = - SIll. To increase ~ we must
move the line upward (parallel to itself), as the arrow shows. Equation (I) with the equality sign is marked
(1) It intersects the coordinate axes at Xl = 6012 = 30 (set x2 = 0) and x2 = 60/8 = 7.S (set Xl = 0). The
arrow marks the side on which the points (Xl. x2) lie that satisfy the inequality in (I). Similarly for Eqs.
(2)-(4). The blue quadrangle thus obtained is called the feasibilit) region. It is the set of all feasible
solutions. meaning solutions that satisfy all four constraints. The figure also lists the revenue at O. A. B. C.
The optimal solution is obtained by moving the line of constant revenue up as much as possible without
leaving the feasibility region completely. Obviously. this optimum is reached when that line pas~es through
B. the intersection (10. S) of (I) and (2). We see that the optimal revenue
';:max = 40· 10
is obtained by producing twice as many S
heater~
+ 88· S = $840
•
as L heaters.
x2
0: z=O
A: z = 40·12 = 480
B: z = 40 . 10 + 88 . 5 = 840
C: z=88·7.5=660
20
(3)
(4)
"
o
"
10
A
20
""""(0)
~""O
"",~O
~
""'(O)~
"'"" ""8</0
Fig. 473.
Linear programming in Example 1
SEC 22.2
941
Linear Programming
Note well that the problem in Example I or similar optimization problems call1lot be
solved by setting certain partial derivatives equal to zero. because crucial to such problems
is the region in which the control variables are allowed to vary.
Furthermore, our "geometric" or graphic method illustrated in Example I is confined
to two variables Xl, x2' However, most practical problems involve much more than two
variables, so that we need other methods of solution.
Normal Form of a Linear Programming Problem
To prepare for general solution methods. we show that constraints can be written more
uniformly. Let us explain the idea in terms of (1),
This inequality implies 60 -
2XI -
8x2
;;:;
0 (and conversely), that is, the quantity
is nonnegative. Hence, our original inequality can now be written as an equation
where
is a nonnegative auxiliary variable introduced for converting inequalities to equations.
Such a variable is called a slack variable, because it "takes up the slack" or difference
between the two sides of the inequality.
X3
E X AMP L E 2
Conversion of Inequalities by the Use of Slack Variables
With the help of two slack variables
following form. Maximize
\-3' .\"4
we can write the linear programming problem in Example I in the
subject to the constraints
1, ... , 4).
(i =
We now have /I = 4 variables and III = 2 (linearly independent) equations. so that two of the four variables.
for example. Xl' x2. determine the others. Also note that each of the four sides of the quadrangle in Fig. 473
now has an equation of the form Xi = 0:
OA: x2 = O.
AB:
x4 =
BC:
X3
0,
= 0,
CO: Xl = O.
A vertex of the quadrangle is the intersection of two sides. Hence at a vertex. n - 1/1 = 4 - 2
variables are /ero and the others are nonnegallve. Thus at A we have x2 = 0, \"4 = 0, and so on.
=
2 of the
•
942
CHAP. 22
Unconstrained Optimization. Linear Programming
Our example suggests that a general linear optimization problem can be brought to the
following normal form. Maxill1i~e
(5)
subject to tlte cOllStrail1ts
(6)
Xi
~
0
(i
=
I, ... ,
11)
with all bj nonnegative. (If a bj < O. multiply the equation by -1.) Here Xl' .... Xn
include the slack variables (for which the c/s in f are zero). We assume that the equations
in (6) are linearly independent. Then, if we choose values for 11 - m of the variables. the
system uniquely detelmines the others. Of course. since we must have
Xl ~
0, ... , Xu
:;;
0,
this choice is not entirely free.
Our problem also includes the minimization of an objective function f since this
corresponds to maximizing - f and thus needs no separate consideration.
An n-tuple (Xl' ••• ,xn ) that satisfies all the constraints in (6) is called afeasible poillt
or feasible solution. A feasible solution is called an optimal solution iffor it the objective
function .f becomes maximum. compared with the values of f at all feasible solutions.
Finally, by a basic feasible solution we mean a feasible solution for which at least
11 - III of the variables Xl' ••• , X" are zero. For instance, in Example 2 we have 11 = 4,
III = 2. and the basic feasible solutions are the four vertices 0, A. B. C in Fig. 473. Here
B is an optimal solution (the only one in this example).
The following theorem is fundamental.
THEOREM 1
,--------------------------------------Optimal Solution
Some optimal solutioll of a lil1ear progra/1lmil1g problem (5). (6) is also a basic
feasible .I'nlutimz of (5), (6).
J
For a proof, see Ref. [F5], Chap. 3 (listed in App. I). A problem can have many optimal
solutions and not all of them may be basic feasible solutions: but the theorem guarantees
that we can find an optimal solution by searching through the basic feasible solutions
only. This is a great simplification; but since there are (
Il) (Il)
1/ - I l l
III
different ways
of equating 11 - III of the 11 variables to zero, considering all these possibilities. dropping
those which are not feasible and then searching through the rest would still involve very
much work. even when 11 and /1l are relatively small. Hence a systematic search is needed.
We shall explain an important method of this type in the next section.
SEC 1l.2
943
Linear Programming
1. What is the meaning of the slack variables X3' X4 in
Example 2 in terms of the problem in Example I?
2. Can we always expect a unique solution (as is the case
in Example L)?
3. Could we find a profit f(XI' X2) = QIXI + Q2X2 whose
maximum is at an interior point of the quadrangle in
Fig. 473? (Give a reason for your answer.)
4. Why are slack variables always nonnegative? How
many of them do we need?
REGIONS AND CONSTRAINTS
15-101
Describe and graph the region in the first quadrant of the
x l x2-plane determined by the inequalities:
5. Xl +
2X2 ~
IO
X2 ~
0
X2 ::;;
2
Xl -
+
7. 2.0XI
5.0XI +
8. 2XI
4XI
Xl
9.
6.0X2 ~
18.0
2.5x2 ~
20.0
X2 ::;;
6
+
5X2 ~
40
-
2X2 ::;;
-3
-
Xl
+
X2 ::;;
3
Xl
+
X2 ~
9
10.
+
X2 ~
X2 ::;;
2
+
5X2 ::;;
15
X2 ::;;
-2
2X2 ~
LO
3x I
2x) -
-Xl +
111-151
MAXIMIZATION AND MINIMIZATION
Maximize the given objective function
given constraints.
11.
f
=
-lOx) +
2X2,
16. (Maximum output) Giant Ladders, Inc., wants to
maximize its daily total output of large step ladders by
producing Xl of them by a process PI and X2 by a
process P 2 , where PI requires 2 hours of labor and 4
machine hours per ladder, and P2 requires 3 hours of
labor and 2 machine hours. For this kind of work, 1200
hours of labor and 1600 hours on the machines are at
most available per day. Find the optimal Xl and X2'
18. (Minimum cost) Hardbrick, Inc., has two kilns. Kiln
I can produce 3000 grey bricks, 2000 red bricks, and
300 glazed bricks daily. For Kiln II the corresponding
figures are 2000, 5000, and 1500. Daily operating costs
of Kilns I and II are $400 and $600. respectively. Find
the number of days of operation of each kiln so that
the operation cost in filling an order of 18000 grey,
34000 red, and 9000 glazed blicks is minimized.
3
Xl +
15. Minimize f in Prob. 11.
17. (Maximum profit) Universal Electric, Inc ..
manufactures and sells two models of lamps, LI and
L 2 , the profit being $150 and $100, respectively. The
process involves two workers WI and W2 who are
available for this kind of work 100 and 80 hours per
month, respectiveLy. WI assembles LI in 20 min and
L2 in 30 min. W 2 paints LI in 20 min and L2 in 10 min.
Assuming that all lamps made can be sold without
difficulty. determine production figures that maximize
the profit.
-Xl +'\2 ::;; -3
-Xl
14. Minimize fin Prob. 13.
Xl::;; 0,
f subject to the
X2::;; 0,
19. (Maximum profit) United MetaL Inc .. produces alloys
B) (special brass) and B2 (yellow tombac). BI contains
50% copper and 50% zinc. (Ordinary brass contains
about 65% copper and 35% zinc.) B2 contains 75%
copper and 25% zinc. Net profits are $120 per ton of
Bl and $100 per ton of B 2 . The daily copper supply is
45 tons. The daily zinc supply is 30 tons. Maximize
the net profit of the daily production.
20. (Nutrition) Foods A and B have 600 and 500 calories,
contain L5 g and 30 g of protein, and cost $1.80 and
$2.10 per unit, respectively. Find the minimum cost
diet of at least 3900 calories containing at least 150 g
of protein.
944
22.3
CHAP. 22
Unconstrained Optimization. Linear Programming
Simplex Method
From the last section we recall the following. A linear optimization problem (linear
proglamming problem) can be written in normal form; that is:
Maximize
(1)
subject to the cOllstraillts
(2)
Xi
~
0
(i
= I. .... 11).
For finding an optimal solution of this problem. we need to consider only the basic feasible
solutions (defined in Sec. 22.2), but there are still so many that we have to follow a
systematic search procedure. In 1948 G. B. Dantzig published an iterative method, called
the simplex method, for that purpose. In this method, one proceeds stepwise from one
basic feasible solution to another in such a way that the objective function f always
increases its value. Let us explain this method in terms of the example in the last section.
In its original form the problem concerned the maximization of the objective function
z
subject to
=
40x 1
+
88x2
2'\'1
+
&-2 ~ 60
X2 ~
O.
Converting the first two inequalities to equations by introducing two slack variables X3,
we obtained the normal form ofthe problem in Example 2. Together with the objective
function (written as an equation;:: - 40X1 - 8!h2 = 0) this normal fonn is
X4'
=
(3)
0
= 60
where Xl ~ 0, ... , .1.'4 ~ O. This is a linear system of equations. To find an optimal
solution of it, we may consider its augmented matrix (see Sec. 7.3)
b
(4)
SEC. 22.3
945
Simplex Method
This matrix is called a simplex tableau or simplex table (the illitial simplex table). These
are standard names. The dashed lines and the letters
-
b
~,
are for ease in further manipulation.
Every simplex table contains two kinds of variables .r} By basic variables we mean
those whose columns have only one nonzero entry. Thus X3' X4 in (4) are basic variables
and Xl' -'"2 are nonbasic variables.
Every simplex table gives a basic feasible solution. It is obtained by setting the nonbasic
variables to zero. Thus (4) gives the basic feasible solution
.\"3
= 60/1 = 60,
X4
=
z=o
60/1 = 60,
with .\"3 obtained from the second row and .\"4 from the third.
The optimal solution (its location and value) is now obtained stepwise by pivoting,
designed to take us to basic feasible solutions with higher and higher values of::: until the
maximulll of z is reached. Here, the choice of the pivot equation and pivot are quite
different from that in the Gauss elimination. The reason is that Xl> X2, .\"3, X4 are restricted
to nonnegative values.
Step 1. Operatioll 0 1 : Selection of the Column of the Pivot
Select as the colunrn of the pivot the first column with a negative entry in Row 1. In (4)
this is Column 2 (because of the -40).
Operatioll O 2 : Selection of the Row of the Pivot. Divide the right sides [60 and 60 in
(4)] by the corresponding entries of the column just selected (6012 = 30. 60/5 = 12).
Take as the pivot equation the equation that gives the smallest quotient. Thus the pivot
is 5 because 60/5 is smallest.
Operatioll 0 3 : Elimination by Row Operations.
pivot (as in Gauss-Jordan, Sec. 7.8).
This gives zeros above and below the
With the notation for row operations as introduced in Sec. 7.3, the calculations in Step
give from the simplex table To in (4) the following simplex table (augmented matrix).
with the blue letters referring to the previous table.
z
(5)
[
b
-~--t--~--~z~.;-t-~---~~:-t-j!~-l
I
o
:
I
5
Row I
= 60/5 = 12,
8 Row 3
Row 2 - 0.4 Row 3
I
2: 0
:
60
We ,ee that basic variables are now Xh X3 and nonbasic variables are
latter to zero, we obtain the basic feasible solution given by T 1,
Xl
+
X3
= 36/1 = 36,
-1."2, .\"4.
Setting the
::: =
480.
This is A in Fig. 473 (Sec. 22.2). We thus have moved from 0: (0, 0) wIth::: = 0 to A:
(12, 0) with the greater::: = 480. The reason for this increase is our elimination of a term
(-40.1."1) with a negative coefficient. Hence elimillation is applied only to Ilegative e1ltries
in Row I but to no others. This motivates the selection of the COIUIIlIl of the pivot
946
CHAP. 22
Unconstrained Optimization. Linear Programming
We now motivate the selection of the row of the pivot. Had we taken the second row
of To instead (thus 2 as the pivot), we would have obtained z = 1200 (verify!), but this
line of constant revenue ~ = 1200 lies entirely outside the feasibility region in Fig. 473.
This motivates our cautious choice of the entry 5 as our pivot because it gave the smallest
quotient (60/5 = 12).
Step 2. The basic feasible solution given by (5) is not yet optimal because of the negative
entry -72 in Row 1. Accordingly, we perform the operations 0 1 to 0 3 again, choosing
a pivot in the column of -72.
Operatioll 0 1 , Select Column 3 of T 1 in (5) as the column of the pivot (because -72 < 0).
Operation 02' We have 3617.2 = 5 and 60/2 = 30. Select 7.2 as the pivot (because 5 < 30).
Operation 0 3 , Elimination by row operations gives
z
T2
(6)
=
b
~-~-i--~--~~-f--J~----~~.~-i-~:~-]
o
:5
I
I
0
:
I
I
1:
3.6
We see that now Xl' X2 are basic and x 3 ,
from T 2 the basic feasible solution
Xl
= 50/5 = 10,
X2
+
10 Row 2
1
50
I
I
0.9
Row I
Row 3
""'::"-Row 2
7.'2
nonbasic. Setting the latter to zero, we obtain
X4
= 36/7.2 = 5,
.: = 840.
This is B in Fig. 473 (Sec. 22.2). In this step . .: has increased from 480 to 840. due to the
elimination of -72 in T l . Since T2 contains no more negative entries in Row I, we
conclude that z = f(10, 5) = 40 . 10 + 88 . 5 = 840 is the maximum po:"sible revenue.
It is obtained if we produce twice as many S heaters as L heaters. This is the solution of
our problem by the simplex method of linear programming.
•
Minimization. If we want to millimize <: = f(x) (instead of maximize), we take as the
columns of the pivots those whose entry in Row I is positive (instead of negative). In
such a Column k we consider only positive entries Tjk and take as pivot a ~ik for which
b/tjk is smallest (as before). For examples, see the problem set .
•...
_-_ .. _
_.......-- --- ...........
- .........
_.......
.... --.
~
-=9l
L _
SIMPLEX METHOD
Write in nonnal form and solve by the simplex method,
assuming all xJ to be nonnegative.
1. Maximize.f = 3Xl + 2X2 subject to 3x1 + 4.l2 ~ 60.
4Xl + 3X2 ~ 60.
IOxl + 2X2 ~ 120.
2. Prob. 16 in Problem Set 22.2.
3. Maximize the profit in the daily production of XI metal
frames Fl ($90 profit/frame) and X2 frames F2 ($50
profit/frame) under the restrictions XI + 3X2 ;§; 1800
(material). Xl + .\"2 ;§; 1000 (machine hours),
3Xl + X2 ;§; 2400 (labor).
4. Maximize f =
X\
+
X2
+
2Xl
X3 ;§;
+ 3X2 +
4.8, lOxI +
\"3
subject to
9.9, -"2 -
X3 ~
X3 ;§;
0.2.
5. The problem in the text with the order of the constmims
interchanged.
6. Minimize f =
4Xl -
3.\"1
+
4X2
+
5.\"3
2Xl
+
3X3
~
30.
~
IOx2 - 20.\"3 subject to
60, 2.\"1 + .\"2 ~ 20,
7. MinimiLe f = 5Xl - 20X2 subject to 2Xl + 5x2 ~ 10.
8. Prob. 20 in Problem Set 22.2.
2Xl
+ 1Ox2 ;§; 5,
SEC. 22.4
Simplex Method:
9. Maximize
f
34xI
=
+ 2X2 +
Xl + X2 + 5X3
X3 ~
8Xl
~
+
947
Difficulties
+
29x2
54. 3xI
+
32x3
8X2
+
(b) Write a program for maximizing:::
in R.
subject to
59.
2X3 ~
39.
= {{IXI
+ 1I2X2
(el Write a program for maximi7ing
::: = {{lXI + ... + lInX" subject to linear constraints.
10. CAS PROJECT. Simplex Method. (a) Write a
program for graphing a region R in the first quadrant
(d) Apply your programs to problems in this problem
of the xlx2-plane determined by linear constraints.
set and the previous one.
22.4 Simplex Method:
Difficulties
We recall from the last section that in the simplex method we proceed stepwise from one
basic feasible solution to another, thereby increasing the value of the objective function
f until we reach an optimal solution. Occasionally (but rather infi'equently in practice).
two kinds of difficulties may occur.
The first of these is degeneracy. A degenerate feasible solution is a feasible solution
at which more than the usual number 11 - 111 of variables are zero. Here 11 is the number
of variables (slack and others) and 111 the number of constraints (not counting the '~i ~ 0
conditions). Tn the last section, 11 = 4 and 111 = 2, and the occurring basic feasible solutions
were nondegenerate; n - 111 = 2 variables were zero in each such solution.
In the case of a degenerate feasible solution we do an extra elimination step in which
a basic variable that is zero for that solution becomes nonbasic (and a nonbasic variable
becomes basic instead). We explain this in a typical case. For more complicated cases
and techniques (rarely needed in practice) see Ref. [F5J in App. I.
E X AMP L E 1
Simplex Method, Degenerate Feasible Solution
AB Steel. Inc .. produces two kinds of iron 11' '2 by using three kinds ot raw matenal R I . R2• R3 (scrap iron and
two kind, of ore) as shown. Maximi7e the daily profit.
Raw Material Needed
per Ton
Raw
Material
[ron 11
iron 12
2
I
Rl
R2
R3
Raw Material Available
per Day (tons)
16
I
8
3.5
0
Net profit
per ton
$150
$300
Let Xl and X2 denote the amount (in tons) of iron '} and 12 , respectively. produced per day. Then
our problem is as follows. Maximize
SoIZlti01l.
;: = I(x) =
subject
to
the constraints Xl
~
O. -'"2
~
150Xl
+
300X2
0 and
(raw material R I )
(raw material R 2 )
(raw material R3 ).
948
CHAP. 22
Unconstrained Optimization. Linear Programming
By introducing slack
variable~ X3' X4. X5
we obtain the normal form of the
con~tr.lints
16
8
(1)
Xi;;;;
I. .... 5).
(i =
0
As in the last section we obtain from (\ ) and (2) the initial simplex tahle
b
[
-~l-~~Q-~J~O-l-~---~---~-r~~-l
To =
(3)
I
I
o:
o II
:
0
I
I
0
0
I .
0 :
Thi~
=
1611 = 16.
is 0: (0.0) in Fig. 474. We have
11
X5
= 5 variables Xj'
m
3.5
= x2 = 0 we have frolll (3) the basic
We see that xl. x2 are nonbasic variables and x3' x4' x!) are basic. With Xl
feasible solution
X3
H
I
I
0
= 3.511 = 3.5.
= 3 constraints, and 11
-
111
;: =
O.
= 2 variables equal
to zero in our ",Iution. which thus is nondegenerate.
Step 1 of Pivoting
Operatioll 0
1:
Column Selection of Pivot. Column 2 hince -150 < 0).
Operatioll O 2: Row Selection of Pivot. 1612 = H, 811 = 8; 3.510 is not possible. Hence we could choose
Row 2 or Row 3. We choose Row 2. The pivot is L
Operation 0 3: Elimination by Row Operations. This gives the simplex table
Xl
X3
-"2
X4
b
X5
------------------------------75
0 I
0 -225
0
(4)
Tl =
[~
0
2
0
0
1
2
1
0
-2
0
0
0
I
I
I
I
I
I
I
I
We see that the basic variables are X], X4.'\5 and the nonbasic are x2'
we obtain from T] the basic feasible solution
I~ 1
Rm, I
'5 Rm\
Rm
4 Ro'
16
0
3.5
t3'
Row 4
Setting the nonoosic variables to rero,
SEC. 22.4
Simplex Method:
949
Difficulties
Xl =
1611 = 8.
X4
= 0/1 = O.
X5
= 3.5/1 = 3.5.
~ =
1200
This is A: (8. 0) in Fig. 474. This solution in degenerate because X4 = 0 (in addition to X2 = O. x3 = 0):
geometrically: the straight line x4 = 0 also pa"es through A. This requires the next step. in which x4 will become
nonbasic.
Step 2 of Pivoti1lg
Operatioll 0 1 : Column Selection of Pivot. Column 3 (since -225 < 0).
Operatioll O 2: Row Selection of Pivot. 16/1 = 16.0/! = O. Hence! must serve as the piVOL
Operatioll 0 3 : Elimination by Row Operations. This gives the following simplex table.
h
~ _:_ Q- - -
.9_:_-::1JQ. - - ~-?.O- - - - Q. _~!~Q_]
Ro",
o :
0 :
Row
(5)
2
I
2
I
[ 0: 0
!:
o :
0 :
0
-2
0:
I()
I
.! Rm .
I
-!
I
0:
- 2
:
0
3.5
Row -I- - 2 Row .•
We see that the basic variables are Xl. x2. x5 and the nonbasic are x3' x4' Hence x4 has become nonbasic. as
intended. By equating the nonbasic variables to /ero we obtain from T 2 the basic feasible solution
Xl
= 16/2 = 8.
-'2 =
O/! =
o.
-'5 =
3.511 = 3.5.
;: =
1200.
This is still A: (8. 0) in Fig. 474 and;: has not increa,ed. But this opens the way to the maximum. which we
reach in the next step.
Step J of Pivoting
Operatioll 0 1 : Column Selection of Pivot. Column 4 (since -150 < 0).
Operatioll O 2: Row Selection of Pivot. 16/2 = 8. O/(-!) = O. 3.511 = 3.5. We crultakc I as the pivot. (With
-! as the pivot we would nm leave A. Try iL)
Operatioll 0 3 : Elimination by Row Operations. This gives the simplex table
b
_1_ ~_ Q(6)
[
o :
I
o II
o :
2
0: 0
0
!
I
I
I
0
0:
We see that basic variables are
basic feasible solution
Xl =
- - Q- ~ _0__ J~O___ !?Q - ~ J 2?J__ ]
2
Xl' X2' X3
0
2
-2:
0
1
-2
-2
and non basic X4'
912 = 4.5.
X3 =
X5'
9
I
I
I
1.75
:
3.5
Row
150 Row-l-
Row
2 Row-l-
Equating the latter to zero we obtain from T3 the
3.5/1 = 3.5.
;: =
1725.
This is B: (-1-.5.3.5) in Fig. 474. Since Row of T3 has no negative entries. we have reached the maximum
daily profit ::max = f(4.5. 3.5) = 150' 4.5 + 300' 3.5 = SI725. This is obtained by u~ing 4.5 tons of iron h
~U~cl~~
•
Difficulties in Starting
As a second kind of difficulty, it may sometimes be hard to find a basic feasible solution
to start from. In such a case the idea of an artificial variable (or several such variables)
is helpful. We explain this method in terms of a typical example.
950
E X AMP L E 2
CHAP. 22
Unconstrained Optimization. Linear Programming
Simplex Method: Difficult Start, Artificial Variable
Maximize
(7)
subject to the constraints Xl
Solutioll.
~
O.
X2 ~
0 and (Fig. 475)
By means of ~Iack variables we achieve the normal foml of the constrrnnts
=0
=
1
=2
(8)
Xi ~
0
(i =
I, .... 51.
Note that the first slack variable is negative (OT zero). which makes \3 nonnegative within the
(and negative outside). From (7) and (8) we obtain the simplex table
fea~ibilit)
region
b
r--L- -~:-l-
-!o I
I
o II
ro :
I
-2
I
-I
_Q ___<L __ ~_ T-~-1
I
I
I
I
-1
0
0
0
I
0
:
0
0
I
1
I .
I :2
I
:
4
are nonbasic. and we would like to take x3. '\4. X5 as basic variables. By our usual process of equating
the nonbasic variables to zero we obtain from this table
Xl. X2
Xl =
O.
.1."3 =
1I(-l) = -1,
X4 =
211 = 2.
"5 =
4/1 = 4 .
~ =
O.
"3 < 0 indicates that (0, 0) lies outside the feasibility region. Since X3 < O. we cannot proceed immediately.
Now. instead of searching for other basic variables. we use the following idea. Solving the second equation in
(8) for .1."3. we have
To this we now add a variable
X6
on the right.
B
2
f=7
\
\
\
\
\
~ C
o ,------A,
o
Fig. 475.
1
~
2
3
Xl
Feasibility region in Example 2
SEC. 22.4
Simplex Method:
Difficulties
951
(9)
is called an artificial variable and is subject to the constraint X6 ~ O.
We must take care that x6 (which is not part of the given problem!) will disappear eventually. We shall see
that we can accomplish this by adding a term -MX6 with very large M to the objective function. Because of
(7) and (9) (solved for x6) this gives the modified objective function for this "extended problem"
X6
(10)
We see that the simplex table corresponding to (10) and (8) is
b
\"1
I L
I -2 - M
-I +!M
__
______________
MOO
0
_________________
-!
:
-I
-I
I
I
I
0
0:
:
0
I
I
o:
I
oI
To =
I
~
I
-!
oI
I
0
0
0:
0
0
0
0
- I
I
~
-M_
___
I
I
I
2
0:
4
I
0
I
The last row of this table results from (9) written as Xl - !X2 - x3 + x6 = 1. We see that we can now start.
taking x4. x5. X6 as the basic variables and Xl' X2. X3 a, the nonbasic variables. Column 2 has a negative first
entry. We can take the second entry (1 in Row 2) as the pivot. This gives
b
I
I
0
-2
o :
I
-!: -
I
-2
0
0
0
I
2
0
0
0:
I
I
---,-------,------------------T--I
T1
=
I
-!:
I
0:
0
0
0:
0
O:O~:
0:3
I
10
I
I
01001000
This corresponds to Xl = 1,.1"2 = 0 (point A in Fig. 475). x3 = 0, -"4 = I,
Row 5 and Column 7. In this way we get rid of \"6' as wanted. and obtain
X5
= 3. X6 = O. We can now drop
b
-~ -t -~- -~! -t- ~~ ---~ ---~-l- ~-j
r
I
I
I 0
_1 I
2
I
I
I
3
I
01021
o
1
I
0
0
I
I I
I
I
13
In Column 3 we choose 312 as the next pivot. We obtain
b
This corresponds to Xl = 2. -'"2 = 2 (this is B in Fig. 475). X3
as the pivot, by the usual principle. This gives
= O. X4
= 2, X5
= O. In Column 4 we choose 4/3
CHAP. 22
952
Unconstrained Optimization. Linear Programming
b
[
This cone5ponds to Xl
=
3.
-~-1-~----~-t-~---!---!-1-~-1
I
I
0:0
O:~
o
X2 =
I
I
0
3
I
"I
I .
~:2
1
3
I
3
"I"
3
()
1 (point C in Fig. 475).
-"
-"3 =
~. x4 = O. x5 = O. This is the maximum
fmax = I(3. 1) = 7.
-............... _.....
-.•........
•
2E4
~
If in a step you have a choice between pivots. take the one
that comes first in the column considered.
1. Maximize:;; = fleX) = 6xl + 12x2 subject to
o ~ Xl ::0; 4. 0::0; X2 ~ 4. 6x1 + 12x2 ::0; 72.
2. Do Prob. I with the last two constraints interchanged.
3. Maximi7e the daily output in producing Xl glass plates
by a process PI and X2 glass plates by a process P2
subject to the constraints (labor hours. machine hours,
raw material supply)
to input constraints /limitation of machine time)
4Xl
8X1
+
2X2
+ 4X2
X2 ;;;
4. Maximize:;; = 300.\"1 + 500.\"2 subject to
2Xl + 8X2 ~ 60. 2X1 + .\"2 ~ 30. 4XI + 4.\"2 ~ 60.
5. Do Prob. 4 with the last two constraints interchanged.
Comment on the resulting simplification.
6. Maximize the total output f = Xl + X2 + X3 (production
figures of three different production processes) subject
5X2
+
+
8X3
4X3
::0; 12,
::0; 12.
::0; 40.
9. Maximize f
::0; 140.
5X2
7. Maximize f = 6.\"1 + 6.\"2 + 9.\"3 subject to
.\"j ;;; 0 (j = I. .... 5), and Xl + .\"3 + X4 = 1.
X2 + .\"3 + \"5 = I.
8. Using an artificial variable. minimize f = 2x 1 - X2
subject to Xl ;;; O. X2 ;;; 0, X1 + X2 ;;; 5, -Xl + X2 ~ I,
5XI
4Xl
+
+
O.
X3
=
::0; 0,
+ X2 +
+ X2 + x3
4Xl
2X3
Xl
::0; 1.
subject to Xl ;;;
Xl
+ X2
.\"3
o.
~ O.
10. If one uses the method of artificial variables in a
problem without solution. this nonexistence will
become apparent by the fact that one cannot get rid of
the artificial variable. TIIustrate this by trying to
maximize f = 2Xl + X2 subject to Xl ;;; 0, X2 ;;; 0,
2Xl + X2 ::0; 2, Xl + 2X2 ;;; 6, Xl + X2 ~ 4.
==::,.:.,::.===:=:--:e-".;::&':==.:U S T ION SAN D PRO B L EMS
1. What is the difference between constrained and
unconstrained optimization?
2. State the idea and the basic formulas of the method of
steepest descent.
3. Write down an algorithm for the method of steepest
descent.
4. Design a "method of steepest ascent" for determining
maxima.
5. What i~ linear programming?
objective function?
lt~
ba~ic
idea? An
6. Why can we not use methods of calculus for extrema
in linear programming?
f(x) = X1 2 + 1.5X22. starting from (6. 3). Do 3 steps.
Why is the convergence faster than in Example 1.
Sec. 22.1?
9. What does the method of steepe~t de~cent amount to in
the case of a single variable?
10. In Prob. 8 start from Xo = [1.5 I ]T. Show that the next
even-numbered approximations are X2 = kXo. X4 = k 2xo.
etc .. where k = OJl4.
11. What happens in Example I of Sec. 22.1 if you replace
the function fex) = X 12 + 3X22 by fex) = x 12 + 5X22?
Do 5 steps, starting from Xo = [6 3]T. Is the
convergence faster or slower?
7. Whar are slack variables? Artificial variables? Why did
we use them'?
12. Apply the method of steepest descent to
f(x) = 9X12 + X2 2 + 18.\'1 - 4X2, 5 steps. starting
from Xo = [2 4]T.
8. Apply the method of steepest descent to
13. In Prob. 12, could you start from [0
O]T and do 5 steps?
Summary of Chapter 22
953
121-25]
14. Show that the gradients in Prob. 13 are orthogonal. Give
a reason.
Xl
115-20 1 Graph or sketch the region in the first quadrant
of the X1x2-plane detelwined by the following inequalities.
15.
17.
Xl
+
2X1
+
3X2 ~
6
X2
~
4
-Xl
+
X2
~
0
XI
+
X2
~
4
16.
0.8X1
18.
2X2 ~
Xl -
Xl -
+
X2
2X1
-4
2xI
+
X2 ~
12
XI
+
X2 ~
8
Xl
+
X2 ~
2
3X2 ~
-12
~
15
f =
~
+
X2
f
~
+ x2
-XI
+
20.
X2 ~
5
X2 ~
3
2X1 -
2
Xl
+ X2
~
Xl
X2
20X2
subject to Xl
~
5.
+
~
X2
subject to
Xl
+
2X2 ~
+
+
40X2 ~
1800
(Machine hours),
20X2 ~
6300
(Labor).
25. Maximize the daily output in producing XI chairs by
a process PI and X2 chairs by a process P 2 subject to
3x 1 + 4X2 ~ 550 (machine hours), 5xI + 4X2 ~ 650
(labor).
--:!:.,: .'fI.=.:.,=:~"'=="==:._,=:':==:='
Unconstrained Optimization.
Linear Programming
In optimization problems we maximize or minimize an objective function.: = f(x)
depending on control variables Xl' ••• , X'" whose domain is either unrestricted
("unconstrained optimization," Sec. 22. I) or restricted by constraints in the fonn
of inequalities or equations or both ("constrained optimization," Sec. 22.2).
If the objective function is linear and the constraints are linear inequalities in
X10 ••• , -'""" then by introducing slack variables X m +l, . • . , Xn we can write the
optimization problem in normal form with the objective function given by
(I)
(where c m + 1
= ell
10.
4.
~
200X1
XI
=
LO.
+
4.
= 2X1 IOx2 subject to Xl - X2 ~ 4.
14. Xl + X2 ~ 9. -Xl + 3x2 ~ 15.
24. A factory produces two kinds of gaskets, G I • G 2 • with
net profit of $60 and $30. respectively. Maximize the
total daily profit subject to the constraints (Xj = number
of gaskets Gj produced per day)
2xI
40X1
19.
lOx.
X2 ~
6.
23. Minimize f
6
2x2 ~
+ X2
22. Maximize
-2
~
Maximize or minimize as indicated.
21. Maximize
= 0) and the constraints given by
(2)
Xl ~
o... '. x.,
~
O.
[n this case we can then apply the widely used simplex method (Sec. 22.3), a
systematic stepwise search through a very much reduced subset of all feasible
solutions. Section 22 4 shows how to overcome difficulties with this method
t
--\
... ~
23
---, C HAP T E R
Graphs.
Combinatorial Optimization
Graphs and digraphs (= directed graphs) have developed into powerful tools in areas,
such as electrical and civil engineering, communication networks, operations research,
computer science, economics, industrial management, and marketing. An essential factor
of this growth is the use of computers in large-scale optimization problems that can be
modeled by graphs and solved by algorithms provided by graph theory. This approach
yields models of general applicability and economic imporlance. II lies in the center of
combinatorial optimization, a term denoting optimization problems that are of
pronounced discrete or combinatorial structure.
This chapter gives an introduction to this wide area, which constitutes a shift of emphasis
away from differential equation~. eigenvalues, and so on, and is full of new ideas as well
as open problems-in connection, for instance, with efficient computer algorithms. The
classes of problems we shall consider include transportation of minimum cost or time,
best assignment of workers to jobs, most efficient use of communication networks, and
many others. Problems for these classes often form the core of larger and more involved
practical problems.
Prerequisite: none.
References and Answers
10
Problems: App. I Parl F, App. 2.
23.1 Graphs and Digraphs
Roughly, a graph consists of points, called vertices, and lines connecting them, called
edges. For example, these may be four cities and five highways connecting them, as in
Fig. 476. Or the points may represent some people, and we connect by an edge those who
do business with each other. Or the vertices may represent computers in a network and
the edges connections between them. Let us now give a fonnal definition .
./JLOOP
~ flsolated
\ ......------...
"
vertex
'-------""
Double edge
Fig. 476. Graph consisting of
4 vertices and 5 edges
954
Fig. 477. Isolated vertex, loop, double
edge. (Excluded by definition.)
SEC. 23.1
955
Graphs and Digraphs
DEFINITION
Graph
A graph G consists of two finite sets (sets having finitely many elements), a set V
of points, called vertices, and a set E of connecting lines, called edges, such that
each edge connects two vertices, called the endpoints of the edge. We write
G
=
(V, E).
Excluded are isolated vertices (vertices that are not endpoints of any edge), loops
(edges whose endpoints coincide), and lIlultiple edges (edges that have both
endpoints in common. See Fig. 477.
CAUTION! Our three exclusions are practical and widely accepted, but not uniformly.
For instance, some authors permit multiple edges and call graphs without them simple
graphs.
•
We denote vertices by letters, Lt, v, ... or VI' V 2 , .•• or simply by numbers 1,2, ...
(as in Fig. 476). We denote edges by el, e2, ... or by their two endpoints; for instance,
el = (1, 4), e2 = (1, 2) in Fig. 476.
An edge (Vi' V) is called incident with the vertex Vi (and conversely); similarly,
(Vi' Vj) is incident with Vj. The number of edges incident with a vertex V is called the
degree of v. Two vertices are called adjacent in G if they are connected by an edge in
G (that is, if they are the two endpoints of some edge in G).
We meet graphs in ditlerent fields under different names: as "networks" in electrical
engineering, "structures" in civil engineering, "molecular structures" in chemistry,
"organizational structures" in economics, "sociograms," "road maps," "telecommunication
networks," and so on.
Digraphs (Directed Graphs)
Nets of one-way streets, pipeline networks, sequences of jobs in construction work, flows
of computation in a computer, producer-consumer relations, and many other applications
suggest the idea of a "digraph" (= directed graph), in which each edge has a direction
(indicated by an arrow, as in Fig. 478).
Fig. 478.
DEFINITION
Digraph
Digraph (Directed Graph)
A digraph G = (V, E) is a graph in which each edge e
its "initial point" i to its "terminal point" j.
=
(i,j) has a direction from
Two edges connecting the same two points i, j are now permitted, provided they have
opposite directions, that is, they are (i,j) and (j, i). Example. (1,4) and (4, 1) in Fig. 478.
956
CHAP. 23
Graphs. Combinatorial Optimization
A subgraph or subdigraph of a given graph or digraph G = (V. E), respectively, is a
graph or digraph obtained by deleting some of the edges and vertices of G. letaining the
other edges of G (together with their pairs of endpoints). For instance, el' e3 (together
with the vertices I, 2. 4) form a subgraph in Fig. 476. and e3, e4, e5 (together with the
vertices l. 3. 4) fonn a subdigraph in Fig. 478.
Computer Representation of Graphs and Digraphs
Drawings of graphs are useful to people in explaining or illustrating specific situations.
Here one should be aware that a graph may be <;ketched in various ways; see Fig. 479.
For handling graphs and digraphs in computers. one uses matrices or lists as appropriate
data stmctures. as follows.
(a)
(b)
Fig. 479.
Different sketches of the same graph
Adjacency Matrix of a Graph G:
{/ij =
(c)
Matrix A =
{Io
[{lU]
with entries
if G has an edge (i, j),
else.
Thus £Iij = 1 if and only if two veItices i and j are adjacent in G. Here, by definition, no
vertex is considered to be adjacent to itself; thus, {/i; = O. A is symmetric, {/ij = {lji- (Why?)
The adjacency matrix of a graph is generally much smaller than the so-called illcidellce
matrix (see Probs. 21, 22) and is preferred over the latter if one decides to store a graph
in a computer in matrix form.
EXAMPLE 1
Adjacency Matrix of a Graph
Vertex
Vertex I
2
3
4
Adjacency Matrix of a Digraph G:
{/ij
=
{Io
2
[~
3
4
0
0
0
J
Matrix A = [aij] with entries
if G has a directed edge (i, j),
else.
This matrix A is not symmetric. (Why?)
•
SEC. 23.1
957
Graphs and Digraphs
E X AMP L E 2
Adjacency Matrix of a Digraph
To vertex
2
From vertex I
2
3
4
3
4
0
[i
0
0
0
0
U
;]
•
Lists. The vertex incidence list of a graph shows for each vertex the incident edges.
The edge incidence list shows for each edge its two endpuints. Similarly for a digraph;
in the vertex list, outgoing edges then get a minus sign, and in the edge list we now have
ordered pairs of vertices.
E X AMP L E 3
Vertex Incidence List and Edge Incidence List of a Graph
This graph is the same as in Example I. except for notation.
Vertex
Incident Edges
Edge
Endpoints
VI
el' e5
el
VI, V 2
V2
el' C2, e3
C2
V 2 , V3
V3
e2' C4
C3
V 2 , V4
V4
C3 • C4 , e5
e4
V 3• V4
e5
VI. V 4
•
"Sparse graphs" are graphs with few edges (far fewer than the maximum possible number
n(n - I )/2. where n is the number of vertices). For these graphs. matrices are not efficient.
Lists then have the advantage of requiring much less storage and being easier to handle;
they can be ordered, sorted, or manipulated in various other ways directly within the
computer. For instance, in tracing a "walk" (a connected sequence of edges with pairwise
common endpoints), one can easily go back and furth between the two lists just discussed,
instead of scanning a large column of a matrix for a single 1.
Computer science has deVeloped more refined lists, which, in addition to the actual
content, contain "pointers" indicating the preceding item or the next item to be scanned
or both items (in the case of a "walk": the preceding edge or the subsequent one). For
details, see Refs. [E 16] and LF7].
This section was devoted to basic concepts and notations needed throughuut this chapter,
in which we shall discuss some of the most impurtant classes of combinatorial optimization
problems. This will at the same time help us to become more and more familiar with
graphs and digraphs.
958
CHAP. 23
-.•.--..-- ..... -....
--~--
Graphs. Combinatorial Optimization
..
~.-.
........ j--. ....
- -.....-.
--
1. Sketch the graph consisting of the vertices and edges
of a square. Of a tetrahedron.
2. Worker WI can do jobs 11 and 13 , worker W2 job 14 ,
worker W3 jobs 12 and J 3 . Represent this by a graph.
3. Explain how the following may be regarded as graphs
or digraphs: flight connections between given cities:
memberships of some persons in some committees;
relations between chapters of a book: a tennis
tournament; a family tree.
4. How would you represent a net of one-way and two-way
streets by a digraph?
5. Give further examples of situations that could be
represented by a graph or digraph.
6. Find the adjacency matrix of the graph in Fig. 476.
Sketch the graph whose adjacency matrix is:
0
0
14.
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
0
16.
ADJACENCY MATRIX
Find the adjacency matrix of the graph or digraph.
8.
0
15.
7. When will the adjacency matrix of a graph be
symmetric? Of a digraph?
18-131
0
0
Sketch the digraph whose adjacency matrix is:
17. The matrix in Prob. 14.
10.
18. The matrix in Prob. 16.
19. (Complete graph) Show that a graph G with 11 vertices
can have at most 11(11 - 1)/2 edges, and G has exactly
n(1l - I )12 edges if G is complete, that is, if every pair
of vertices of G is joined by an edge. (Recall that loops
and multiple edges are excluded.)
20. In what case are all the off-diagonal entries of the
adjacency matrix of a graph G equal to I?
11.
Incidence Matrix of a Graph: Matrix B = [bjk ] with
entries
if vertex j is an endpoint of edge ek
otherwise.
Find the incidence matrix of:
21. The graph in Prob. 9.
12.
22. The graph in Prob. 8.
Incidence Matrix of a Digraph: Matrix B =
entries
a;jk] with
if edge ek leaves vertex j
if edge ek enters vertex j
13.
othelwise.
Find the incidence matrix of:
23. The digraph in Prob. II.
24. The digraph in Prob. 13.
25. Make a vertex incidence list ofthe digraph in Prob. 13.
SEC. 23.2
Shortest Path Problems.
23.2
959
Complexity
Shortest Path Problems.
Complexity
Beginning in this section, we shall discuss some of the most important classes of
optimization problems that concern graphs and digraphs as they arise in applications. Basic
ideas and algorithms will be explained and illustrated by small graphs, but you should
keep in mind that real-life problems may often involve many thousands or even millions
of vertices and edges (think of telephone networks, worldwide air travel, companies that
have offices and stores in all larger cities). Then reliable and efficient systematic methods
are an absolute necessity-solution by inspection or by trial and error would no longer
work, even if "'nearly optimal" solutions are acceptable.
We begin with shortest path problems, as they arise, for instance, in designing shortest
(or least expensive, or fastest) routes for a traveling salesman. for a cargo ship. etc. Let
us first explain what we mean by a path.
In a graph G = (V, E) we can walk from a vertex VI along some edges to some other
vertex vk- Here we can
(A) make no restrictions. or
(B) require that each edge of G be traversed at most once, or
(C) require that each vertex be visited at most once.
In case (A) we call this a walk. Thus a walk from
VI
to
Vk
is of the form
(1)
where some of these edges or vertices may be the same. In case (B), where each edge
may occur at most once, we call the walk a trail. Finally, in case (C), where each vertex
may occur at most once (and thus each edge automatically occurs at most once), we call
the trail a path.
We admit that a walk. trail. or path may end at the vertex it started from. in which case
we call it closed; then Vk = VI in (1).
A closed path is called a cycle. A cycle has at least three edges (because we do not
have double edges; see Sec. 23.1). Figure 480 illustrates all these concepts.
Walk, trail, path, cycle
Fig. 480.
14
11-
21
22-
3233-
2 is
34 4-
a walk (not a trail).
4 - 5 is a trail (not a path).
5 is a path (not a cycle).
1 is a cycle.
Shortest Path
To define the concept of a shortest path, we assume that G = (V, E) is a weighted graph,
that is, each edge (Vi, V) in G has a given weight or length lij > O. Then a shortest path
VI ----;. Vk (with fixed VI and Vk) is a path (1) such that the sum of the lengths of its edges
112
(/12 = length of
VI
to
Vk)'
+
123
+
134
+ ... +
Ik-I,K
(Vb V 2 ), etc.) is minimum (as small as possible among all paths from
Similarly, a longest path VI ----;. Vk is one for which that sum is maximum.
960
CHAP. 23
Graphs. Combinatorial Optimization
Shortest (and longest) path problems are among the most important optimization problems.
Here, "length" lij (often also called "cost" or "weight") can be an actual length measured
in miles or travel time or gasoline expenses, but it may also be something entirely different.
For instance, the "traveling salesman problem" requires the determination of a shortest
Hamiltonian i cycle in a graph, that is, a cycle that contains all the vertices of the graph.
As another example, by choosing the "most profitable" route VI --4 Vk, a salesman may
want to maximize "Llij , where lij is his expected commission minus his travel expenses
for going from town i to townj.
In an investment problem, i may be the day an investment is made,j the day it matures,
and lij the resulting profit, and one gets a graph by considering the various possibilities
of investing and reinvesting over a given period of time.
Shortest Path if All Edges Have Length I = 1
Obviously, if all edges have length I, then a shortest path VI - Vk is one that has the
smallest number of edges among all paths VI - Vk in a given graph G. For this problem
we discuss a BFS algorithm. BFS stands for Breadth First Search. This means that in
each step the algorithm visits all neighhnring (all adjacent) vertices of a vertex reached.
as opposed to a DFS algorithm (Depth First Search algorithm), which makes a long trail
(as in a maze). This widely used BFS algorithm is shown in Table 23.1.
We want to find a shortest path in G from a vertex s (start) to a vertex t (terminal). To
guarantee that there is a path from s to t, we make sure that G does not consist of separate
portions. Thus we assume that G is connected, that is, for any two vertices V and w there
is a path V _ w in G. (Recall that a vertex V is called adjacent to a vertex 11 if there is
an edge (u, v) in G.)
Table 23.1
Moore's BFS for Shortest Path (All Lengths One)
Proceedings of the Illfemationai Symposillm for Switching TIleory. Part II. pp. 285-292. Cambridge: Harvard
University Press. 1959.
ALGORITHM MOORb [G
=
(V, E), s, t]
This algorithm determines a shortest path in a connected graph G
s to a vertex t.
INPUT:
OUTPUT:
1.
2.
3.
4.
5.
= (V, E)
from a vertex
Connected graph G = (V, E), in which one vertex is denoted by sand
one by t, and each edge (i, j) has length lij = 1. Initially all vertices are
unlabeled.
A shortest path s ---+ t in G = (V. £)
Label s with O.
Set i = O.
Find all unlabeled vertices adjacent to a vertex labeled i.
Label the vertices just found with i + l.
If vertex t is labeled. then "backtracking" gives the shortest path
k (= label of t), k - 1, k - 2, ... , 0
OUTPUT k, k - 1, k - 2, ... , O. Stop
Else increase i by I. Go to Step 3.
End MOORE
lWILLIAM ROWAN HAMILTON (1805-1865), Irish mathematician, known for his work in dynamics
SEC. 23.2
Shortest Path Problems.
E X AMP L E 1
961
Complexity
Application of Moore's BFS Algorithm
Find a shortest path s
---+
t in the graph G shown in Fig. 481.
Solution.
Figure 481 shows the labels. The blne edges form a shortest path (length 4). There is another
shortest path s ---+ t. (Can yon find it?) Hence in the program we must introduce a rule that makes backtracking
unique because otherwise the computer would not know what to do next if at some step there is a choice (for
instance, in Fig. 481 when it got back to the vertex labeled 2) The following rule seems to be natural.
Backtracking rule.
Using the numbeIing of the vertices from I to II (not the labeling'). at each step, if a
vertex labeled i i, reached, take as the next ve11ex that with the smallest number (not label!) among all the
•
vertices labeled i - I .
2
Fig. 481.
Example 1, given graph and result of labeling
Complexity of an Algorithm
Complexity of Moore's algorithm. To find the vertices to be labeled 1, we have to scan
all edges incident with s. Next, when i = 1, we have to scan all edges incident with vertices
labeled I, etc. Hence each edge is scanned twice. These are 2m operations (Ill = number
of edges of G). This is a function c(m). Whether it is 2171 or 5111 + 3 or 12m is not so essential;
it is essential that c(m) is proportional to 111 (not nz 2 , for example); it is of the "order" 1II.
We write for any function alll + b simply Oem), for any function am 2 + bm + d simply
0(11/ 2 ), and so on; here, 0 suggests order. The underlying idea and practical aspect are
as follows.
lnjudging an algorithm, we are mostly interested in its behavior for very large problems
(large m in the present case), since these are going to determine the limits of the
applicability of the algorithm. Thus, the essential item is the fastest growing term (am 2
in alll 2 + bill + d, etc.) since it will overwhelm the others when III is large enough. Also,
a constant factor in this term is not very essential; for instance, the difference between
two algorithms of orders. say, 5111 2 and 8m 2 is generally not very essential and can be
made irrelevant by a modest increase in the speed of computers. However, it does make
a great practical difference whether an algorithm is of order 11/ or m 2 or of a still higher
power lII P • And the biggest difference occurs between these "polynomial orders" and
"exponential orders," such as 2'11.
For instance, on a computer that does I 09 operations per second. a problem of size
m = 50 will take 0.3 second with an algorithm that requires /715 operation~, but 13 days
with an algorithm that requires 2m operations. But this is not our only reason for regarding
polynomial orders as good and exponential orders as bad. Another reason is the gain in
using afaster computer. For example let two algorithms be Oem) and 0(111 2 ). Then, since
1000 = 31.62 , an increase in speed by a factor 1000 has the effect that per hour we can
do problems 1000 and 31.6 times as big, respectively. But since 1000 = 2 9 .97 • with an
algorithm that is 0(2"'), all we gain is a relatively modest increase of 10 in problem size
because 2 9 . 97 • 2 m = 2>n+9.97.
962
CHAP. 23
Graphs. Combinatorial Optimization
The symbol 0 is quite practical and commonly used whenever the order of growth is
essential. but not the specific form of a function. Thus if a function g(m) is of the form
gem)
= kh(m) + more slowly growing terms
(k
*" 0, constant),
we say that g(m) is of the order h(111) and write
g(m)
=
O(lZ(111)).
For instance,
am
+
b
=
5 . 2m
0(111).
+
3m 2 = 0(21n).
We want an algorithm .iI to be "efficient." that is. "good" with respect to
(i) Time (number C.,l(111) of computer operations), or
(ii) Space (storage needed in the internal memory)
or both. Here c. J suggests "complexity" of .iI. Two popular choices for
cjm) = longest time 9'1 takes for a problem of size
(Worst case)
(Average case)
C~'I
are
/11,
c.cim) = average time sl takes for a problem of size
111.
In problems on graphs, the "size" will often be 11l (number of edges) or 11 (number of
vertices). For our present simple algorithm, cJm) = 2m in both cases.
For a "good" algorithm.iI, we want that c.jm) does not grow too fast. Accordingly,
we call .cJ1 efficient if C,,'1(111) = O(mk) for some integer k ~ 0; that is, C". I may contain
only powers of m (or functions that grow even more slowly, such as In m), but no
exponential functions. Furthermore, we call sJ polynomially bounded if .9l is efficient
when we choose the "worst case" C,,«111). These conventional concepts have intuitive
appeal. as our discussion shows.
Complexity should be investigated for every algorithm, so that one can also compare
different algorithms for the same task. This may often exceed the level in this chapter;
accordingly, we shall confine ourselves to a few occasional comments in this direction.
[1=6]
SHORTEST PATH
5.
Find a shortest path P: s --> t and its length by Moore's
BFS algorithm; si--erch the graph with the labels and indicate
P by heavier lines (as in Fig. 481).
1.
2't~\
\
·V~
. . . . . /"-J
s
4. /
"':::) ' " 0
~~-''b
t/
6.
S
/\_1-1- . . . .
V\/_. /"
__
t
/\/~(
!
sQ\
•
•
7. (Nonuniqueness) A shonest path s --> t for given sand
t need not be unique. Illustrate this by finding another
shortest path s --> t in Example I in the text.
8. (Maximum length) If P is a shortest path between any
two vertices in a graph with n vertices, how many edges
can P at most have'? In a complete graph (with all edges
of length 1)7 Give a reason.
9. (Moore's algorithm) Show that if a vertex v has label
.A(v) = k, then there is a path s --> v of length k.
SEC. 23.3
Bellman's Principle.
10. Call the length of a shortest path s ~ v the distance
of v from s. Show that if v has distance /, it has label
A(v)
=
963
Dijkstra's Algorithm
16. Find 4 different closed Euler trails in Fig. 484.
2
I.
/\/\
11. (Hamiltonian cycle) Find and sketch a Hamiltonian
cycle in the graph of Prob. 3.
12. Find and sketch a Hamiltonian cycle in the graph of a
dodecahedron. which has 12 pentagonal faces and
20 vertices (Fig. 482). This is a problem Hamilton
himself considered.
Fig. 482.
Problem 12
13. Find and sketch a Hamiltonian cycle
Sec. 23.1.
In
Fig. 479.
14. (Euler graph) An EllIeI' graph G is a graph that has a
clo:-.ed Euler trail. An Euler trail is a trail that contains
every edge of G exactly once. Which subgraph with
four edges of the graph in Example I, Sec. 23.1. is an
Euler graph?
15. Is the graph in Fig. 483 an Euler graph? (Give a reason.)
4
3
Fig. 484.
5
Problem 16
17. The postman problem is the problem of finding a
closed walk W: s ~ s (s the post office) in a graph G
with edges (i,j) of length lij > 0 such that every edge
of G is traversed at least once and the length of W is
minimum. Find a solution for the graph in Fig. 483 by
inspection. (The problem is also called the Chinese
postman problem since it was published in the journal
Chinese MathenlOtic.I' 1 (1962),273-277.)
18. Show that the length of a shortest postman trail is the
same for every starting verteX.
19. (Order) Show that 0(111 3 )
kO(111 P )
=
+
0(111 3 ) =
0(111 3 ) and
O(m P )'
20. Show that ~ = 0(111), O.02e m
+
100m2
= O(e m ).
21. If we switch from one computer to another that is 100
times as fast. what is our gain in problem size per hour
in the use of an algorithm that is 0(111), 0(111 2 ). 0(111 5 ).
O(e"")?
4
3}-------{
Fig. 483.
23.3
Problems 15, 17
Bellman's Principle.
22. CAS PROBLEM. Moore's Algorithm. Write a
computer program for the algorithm in Table 23.1. Test
the program with the graph in Example I. Apply it to
Probs. 1-3 and to some graphs of your own choice.
Dijkstra's Algorithm
We continue our discussion of the shorrest path problem in a graph G. The last section
concerned the special case that all edges had length 1. But in most applications the edges
(i, j) will have any lengths lij > 0, and we now turn to this general case, which is of
greater practical importance. We write lij = :x; for any edge (i,j) that does not exist in G
(setting 'Xl + a = :x; for any number a, as usual).
We consider the problem of finding shortest paths from a given veltex. denoted by I
and called the origin, to all other vel1ices 2. 3 ..... Il of C. We let Lj denote the length
of a shortest path p/ I ~ j in G.
THEOREM 1
Bellman's Minimality Principle or Optimality Principle2
{l Pj
: I ~ j is a ShOrTeST path from 1 TO j ill G lllld (i. j) is tlze llist edge of Pj
(Fig. 485), Then Pi: 1 ~ i [obtained by droppi1lg (i, j) from Pj ] is a sh017est path
I~i.
2RICHARD BELLMAN (1920--1984). American mathematician, known for his WOIX in dynamic programming.
964
CHAP. 23
Graphs. Combinatorial Optimization
P.I
~
__________~A~__________~\
~
"'/-v~
Fig. 485.
PROOF
j
Paths P and Pi in Bellman's minimality principle
Suppose that the conclusion is false. Then there is a path P;"': I _ i that is shorter than
Pi' Hence if we now add U. j) to Pi*, we get a path I _ j that is shorter than Pj' This
•
contradicts our assumption that Pj is "hortest.
From Bellman's principle we can derive basic equations as follows. For fixed j we may
obtain various paths 1 _ j by taking shortest paths Pi for vmious i for which there is in
G an edge (i,j), and add U,.i) to the conesponding Pi' These paths obviously have lengths
Li + lij (Li = length of Pi)' We can now take the minimum over i. that is. pick an i for
which Li + lij is smallest. By the Bellman principle. this gives a shortest path I ~ j. It
has the length
Ll = 0
(1)
L·:J = min
(L·
i*j
1,
+
I)
1,J'
j = 2, ... ,
fl.
These are the Bellman equations. Since Iii = 0 by definition, instead of mini'1'j we can
simply write mini' These equations suggest the idea of one of the best-known algorithms
for the shortest path problem, as follows.
Dijkstra's Algorithm for Shortest Paths
Dijkstra's3 algorithm is shown in Table 23.2, where a connected graph G is a graph in
which for any two vertices v and 1I' in G there is a path v _ w. The algorithm is a labeling
procedure. At each stage of the computation. each vertex v gers a label, either
(PU a permallent label
=
length Lv of a shortest path 1 ----7
V
or
(TL) a temporary label = upper bound
Lv for the length of a shortest path
1 ~ v.
We denote by '!P;£ and 2J;£ the sets of vertices with a permanent label and with a temporary
label, respectively. The algorithm has an initial step in which vertex I gets the permanent
label Ll = 0 and the other vertices get temporary labels, and then the algorithm alternates
between Steps 2 and 3. In Step 2 the idea is to pick k "minimally." In Step 3 the idea is
that the upper bounds will in general improve (decrease) and must be updated accordingly.
Namely, the new temporary label I j of vertex j will be the old one if there is no
improvement or it will be Lk + Ikj if there is.
3
EDSGER WYBE DIJKSTRA (1930-2002), Dutch computer scientist. 1972 recipient of the ACM TurinG"
Award. His algorithm appeared in Numerische Mathematik J (1959),269-271.
b
SEC. 23.3
Bellman's Principle.
965
Dijkstra's Algorithm
Table 23.2
Dijkstra's Algorithm for Shortest Paths
ALGORITHM DIJKSTRA [G = (V, E), V = {L ... , 17}, lij for all (i, j) in E]
Given a connected graph G = (V, i',") with vertices 1, ... , n and edges (i, j) having
lengths liJ > 0, this algorithm determines the lengths of shortest paths fi'om vertex I to
the vertices 2, ... , n.
INPUT: Number of vertices 17, edges (i, j), and lengths lij
OUTPUT: Lengths Lj of sh011est paths I
-?
j, j = 2, ... , n
1. lnitiul step
Vertex I gets PL: LI = O.
Vertexj (= 2, .. ',1/) gets TL: ~ = Ilj (=
Set qp5£ = {I}, 'j5£ = {2, 3, ... , n}.
C/J
if there is no edge (1,j) in G).
2. Fixing a permanent label
Find a k in 'j5£ for which Lk is miminum, set Lk = L k . Take the smallest k if
there are several. Delete k from 'j5£ and include it in f!P5£.
If 'j5£ = 0 (that is, 'j5£ is empty) then
OUTPUT L 2 ,
Ln. Stop
••. ,
Else continue (that is, go to Step 3).
3. Updating tempnral}" labels
For all j in 'j5£, set L j ~ mink {L j , Lk
Lk + I kj as your new L j ).
+
lk.i} (that is, take the smaller of L j and
Go to Step 2.
End DIJKSTRA
E X AMP L E 1
Application of Dijkstra's Algorithm
Applying Dijkstra's algorithm to the graph in Fig. 4b6a, find shortest paths from vertex I to vertices 2, 3, 4.
Solution.
=
2. L3 =
3. L2 =
L4 =
2. L2 =
1.
Ll
We list the steps and computations.
0,
L2 = 8, La = 5, L4 = 7,
[L 2 .L3 , L4 } = 5. k = 3.
min
min {8,L3 + /32J
min [7, L3
min
3.
L4
=
2.
4
= 7.
{L2' L4 ) =
min {7,L2
=
+ 134} =
+
min [6,7)
/24)
=
=
6
= 2.
=
7
min {8, 5 + IJ
min [7, x}
=
=
6, k
min [7,6
+
'!J5f. = [2,3, 4}
'!J5f. = {2, 4}
'If':£
'!J5f.
=
[4}
'!J:£
=
0.
7
2}
=
{I, 2, 3],
'd''£= [1,2,3,4},
k = 4
Figure 486b shows the resulting shortest paths. of lengths
(a) Given graph G
Lz =
6. L3
= 5.4 =
7.
(b) Shortest paths in G
Fig. 486.
Complexity.
qp5f. = {I},
'!P5f.={1.3},
Dijkstra's algorithm is 0(n2).
Example 1
•
CHAP. 23
966
PROOF
Graphs. Combinatorial Optimization
Step 2 requires comparison of elements, first II - 2, the next time 11 - 3, etc., a total
of (11 - 2)(11 - 1)/2. Step 3 requires the same number of comparisons, a total of
(11 - 2)(11 - I )/2, as well m; additions, fIrst Il - 2, the next time 11 - 3, etc., dgain a [otal of
2
(ll - 2)(11 - I )/2. Hence the total number of operations is 3(11 - 2)(11 - 1)/2 = O(1l ) . •
1. The net of roads in Fig. 487 connecting four villages
is to be reduced to minimum length. but so that one
can still reach every village from every other village.
Which of the roads should be retained? Find the
solution (a) by inspection. (b) by Dijkstra's
algorithm.
·t
5.
6.
18
Fig. 487.
Problem 1
2-=7]
DIJKSTRA'S ALGORITHM
Find shortest paths for the following graphs.
2.
3.
8. Show that in Dijkstra's algorithm, for L" there is a path
P: I ~ k of length L k .
9. Show that in Dijkstra's algorithm. at each instant the
demand on storage is light (data for less than II edges)
10. CAS PROBLEM. Dijkstra's Algorithm. Write a
program and apply it to Probs. 2--4.
23.4
Shortest Spanning Trees:
Greedy Algorithm
So far we have discussed shortest path problems. We now turn to a particularly important
kind of graph. called a tree. along with related optimization problems that arise quite often
in practice.
By definition. a tree T is a graph that is connected and has no cycles. "Connected"
was defined in Sec. 23.3; it means that there is a path fmm any vertex in T to any other
veltex in T. A cycle is a path s ~ t of at least three edges that is closed (t = s); see also
Sec. 23.2. Figure 488a shows an example.
CAUTION!
The terminology varies; cycles are sometimes also called circuits.
SEC. 23.4
967
Shortest Spanning Trees: Greedy Algorithm
A spanning tree T in a given connected graph G = (V, E) is a tree containing all the
vertices of G. See Fig. 488b. Such a tree has 17 - 1 edges. (Proof?)
A shortest spanning tree T in a connected graph G (whose edges (i, j) have lengths
lij> 0) is a spanning tree for which 'Llij (sum over all edges of 7) is minimum compared
to 'Llij for any other spanning tree in G.
17
Trees are among the most important types of graphs, and they occur in various
applications. Familiat examples ate family trees and organization charts. Trees can be used
to exhibit, organize, or analyze electrical networks, producer-consumer and other business
relations, infonnation in database systems, syntactic structure of computer programs, etc.
We mention a few specific applications that need no lengthy additional explanations.
The set of shortest paths from vertex I to the vertices 2..... 17 in the last section forms
a spanning tree.
Railway lines connecting a number of cities (the vertices) can be set up in the form of
a spanning tree, the "length" of a line (edge) being the construction cost, and one wants
to minimize the total construction cost. Similarly for bus lines, where "length" may be
the average annual operating cost. Or for steamship lines (freight lines), where "length"
may be profit and the goal is the maximization of total profit. Or in a network of telephone
lines between some cities, a shortest spanning tree may simply represent a selection of
lines that connect all the cities at minimal cost. In addition to these examples we could
mention others from distribution networks. and so on.
We shall now discuss a simple algorithm for the problem of finding a shortest spanning
tree. This algorithm (Table 23.3) is patticularly suitable for spatse graphs (graphs with
very few edges: see Sec. 23.1).
Table 23.3
Kruskal's Greedy Algorithm for Shortest Spanning Trees
Proceedings of the American Mathematical Society 7 (1956), 48-50.
ALGORITHM KRUSKAL [G
=
(V, E), lij for all (i,j) in EJ
Given a connected graph G = (V, E) with edges (i.j) having length lij > O. the algorithm
detelmines a shortest spanning tree Tin G.
INPUT: Edges (i, j) of G and their lengths
lij
OUTPUT: ShOltest spanning tree T in G
1. Order the edges of G in ascending order of length.
2. Choose them in this order as edges of T, rejecting an edge only if it forms a
cycle with edges already chosen.
If 17 - I edges have been chosen. then
OUTPUT T (= the set of edges chosen). Stop
EndKRUSKAL
CaJ A cycle
Fig. 488.
(b) A spanning tree
Example of (a) a cycle, (b) a spanning tree in a graph
968
E X AMP L E 1
CHAP. 23
Graphs. Combinatorial Optimization
Application of Kruskal's Algorithm
Using Kmskars algorithm. we shall determine a shortest spanning tree in the gmph in Fig. 489.
Fig. 489.
Graph in Example 1
Solution.
See Table 23.4. In some of the intennediate stages the edges chosen form a disconnected gmph
(see Fig. 490); this is typical. We stop after n - I = 5 choices since a spanning tree has II - I edges. In our
problem the edges chosen are in the upper part of the list. This is typical of problems of any ~ize: in general,
•
edges farther down in the list have a smaller chance of being chosen.
Table 23.4
Edge
Solution in Example 1
Length
(3,6)
(I. 2)
(1,3)
(4.5)
(2.3)
(3,4)
(5.6)
(2,4)
2
4
6
7
8
Choice
1st
2nd
3rd
4th
Reject
5th
9
11
The efficiency of Kruskars method is greatly increased by
Double Labeling of Vertices.
Each vertex i carries
1I
ri
= Root of the subtree to which
Pi
= Predecessor of i
= 0 for roots.
Pi
double label (ri, Pi), where
i belongs,
ill its subtree,
This simplifies
Rejecting. If (i. j) is next in the list to be considered, reject (i. j) if ri = ') (that is. i and
j are in the same subtree. so that they are already joined by edges and (i. j) would thus
create a cycle). If ri i= I} include (i, j) ill T.
If there are several choices for ri, choose the smallest. If subtrees merge (become a
single tree), retain the smallest root as the root of the new subtree.
For Example I the double-label list is shown in Table 23.5. In storing it, at each instant
one may retain only the latest double label. We show all double labels in order to exhibit
the proces~ in all its stages. Labels that remain unchanged are nO[ listed again. Underscored
are the two I' s that are the common root of vertices 2 and 3, the reason for rejecting the
edge (2, 3). By reading for each vertex the latest label we can read from this list that I is
the vertex we have chosen as a root and the tree is as shown in the last part of Fig. 490.
SEC. 23.4
969
Shortest Spanning Trees: Greedy Algorithm
1
2
--"'I
3~
f
6
Second
First
Third
Fig. 490.
4
3/
Fourth
Choice process in Example 1
"
Fifth
This is made possible by the predecessor label that each vertex carries. Also, for accepting
or rejecting an edge we have to make only one comparison (the roots of the two endpoints
of the edge).
Ordering is the more expensive part of the algorithm. It is a standard process in data
processing for which various methods have been suggested (see Sorting in Ref. [E25]
listed in App. 1). For a complete list of 111 edges, an algorithm would be Oem log211/),
but since the 17 - 1 edges of the tree are most likely to be found earlier, by inspecting
the q « 1/1) topmost edges, for such a list of q edges one would have
O(q log2 /11).
Table 23.5
List of Double Labels in Example 1
Choice 1
(3,6)
Vertex
Choice 2
n,2)
Choice 3
(1,3)
Choice 4
(4,5)
Choice 5
(3,4)
l4,0)
(4,4)
n, 3)
(1,0)
2
3
4
5
6
11-61
(1, 1)
(3,0)
(3, 3)
KRUSKAL'S ALGORITHM
Find a shortest spanning tree by Kruskal' s algorithm.
1.
(1, 1)
2.
(1,3)
3.
(1,4)
CHAP. 23
970
Graphs. Combinatorial Optimization
8. Design an algorithm for obtaining longest spanning
trees.
9. Apply the algorithm in Prob. 8 to the graph in Example
I. Compare with the result in Example I.
10. To get a minimum spanning tree, instead of adding
shortest edges, one could think of deleting longest
edges. For what graph5 would this be feasible?
Describe an algorithm for this.
11. Apply the method suggested in Prob. IO to the graph
in Example 1. Do you get the same tree?
7. CAS PROBLEM. Kruskal's Algorithm. Write a
corresponding program. (Sorting is discussed in Ref.
[E25] listed in App. I.)
Chicago
Dallas
Denver
Los Angeles
New York
Dallas
Denver
Los Angele<;
New York
Washington, DC
800
900
650
1800
700
1350
1650
2500
650
1200
1500
2350
200
13. (Forest) A (not necessarily connected) graph without
cycles is called a forest. Give typical examples of
applications in which graphs occur that are forests or
trees.
I]4-20 I
GENERAL PROPERTIES OF TREES
Prove:
14. (l:niqueness) The path connecting any two vertice~ 1I
and u in a tree is unique.
15. If in a graph any two vertices are connected by a unique
path, the graph is a tree.
23.5
12. Find a shortest spanning tree in the complete graph of
all possible 15 connections between the six cities given
(distances by airplane. in miles. rounded). Can you
think of a practical application of the result?
1300
850
I
16. If a graph has no cycles, it must have at least 2 vertices
of degree 1 (definition in Sec. 23.\).
17. A tree with exactly two vertices of degree 1 must be a
path.
18. A tree with
induction.)
11
vertices has
11 -
I edges. (Proof by
19. If two vertices in a tree are joined by a new edge. a
cycle is formed.
20. A graph with 11 vertices is a tree if and only if it has
11 1 edges and has no cycles.
Shortest Spanning Trees:
Prim's Algorithm
Prim's algorithm shown in Table 23.6 is another popular algorithm for the shortest
spanning tree problem (see Sec. 23.4). This algorithm avoids ordering edges and gives a
tree T at each stage. a property that Kruskal's algorithm in the last section did not have
(look back at Fig. 490 if you did not notice it).
In Plim's algorithm, starting from any ~ingle vertex, which we call I, we "grow" the
tree T by adding edges to it, one at a time, according to some rule (in Table 23.6) until
T finally becomes a spanning tree, which is shortest.
We denote by U the set of vertices of the growing tree T and by S the set of its edges.
Thus, initially U = {I} and S = 0; at the end, U = V. the vertex set of the given graph
G = (V. E), whose edges (i, j) have length lij > 0, as before.
SEC. 23.5
Shortest Spanning Trees:
971
Prim's Algorithm
Thus at the beginning (Step 1) the labels
2.... ,
of the vertices
are the lengths of the edges connecting them to vertex I (or
00
11
if there is no such edge in
G). And we pick (Step 2) the shortest of these as the first edge of the growing tree T and
include its other endj in U (choosing the smallestj if there are several, to make the process
unique). Updating labels in Step 3 (at this stage and at any later stage) concerns each
vertex k not yet in U. Vertex k has label Ak = li(k),k from before. If Ijk < Ak , this means
that k is closer to the new member j just included in U than k is to its old "closest neighbor"
i(k) in U. Then we update the label of k, replacing Ak = li(k).k by Ak = Ijk and setting
i(k) = j. If. however, Ijk :;;:: Ak (the old label of k), we don't touch the old label. Thus the
label Ak always identifies the closest neighbor of k in U, and this is updated in Step 3 as
U and the tree T grow. From the final labels we can backtrack the final tree, and from
their numeric values we compute the total length (sum of the lengths of the edges) of this
tree.
Table 23.6
Prim's Algorithm for Shortest Spanning Trees
Bell System Technical Joltl1l1l136 (1957). 1389-I·WL
For an improved version of the algorithm. see Cheriton and Tmjan. SIAM Jolt,."al on COmplltlition 5
(1976).724-7-12.
ALGORITHM PRIM [G
=
(V, E), V
= {I, ... , 11},
lij for all (i, j) in E]
Given a connected graph G = (V, E) with vertices 1,2, ... ,11 and edges (i,j) having
length lij > O. this algorithm dete1l11ine~ a shortest spanning tree Tin G and its length
L(n
INPUT: n. edges (i. j) of G and their lengths lij
OUTPUT: Edge set 5 of a shortest spanning tree T in G: L(T)
[Initially, alll'ertices are lInlabeled.]
1. Initial step
Set i(k) = I, U = {I}, 5
Label vertex k (= 2, ...
=
0.
,11)
with Ak = Ii/e 1=
:G
if G has no edge
n. k)j.
2. Addition of an edge to the cree T
Let Aj be the smallest Ale for vertex k not in U. Include veltex j in U and edge
(i(j), j) in 5.
If U = V then compute
L(T) = "'2:/ij (sum over all edges in 5)
OUTPUT 5. UT). Stop
[5 is the edge set of a shortest spanning tree T ill G.]
Else continue (that is. go to Step 3).
3. Label updating
For every k not in U. if Ijle < Ak • then set Ale = lile and i(k) = j.
Go to Step 2.
End PRIM
CHAP. 23
972
Graphs. Combinatorial Optimization
Fig. 491.
E X AMP L E 1
Graph in Example 1
Application of Prim's Algorithm
Find a shortest spanmng tree in the graph
can <:ompare).
In
Fig. 4Yl (which is the
~ame
as in Example I. Sec. 23.4, so that we
Solutio1/. The steps are a~ follows.
1.
irk) = I, U =
IlJ, S
=
0, initial labels see Table 23.7.
2. A2 = 112 = 2 is smallest, U = 11. 2J. S = {(I, 2)/
3. Update labels as shown in Table 23.7. column (I).
2. A3 = 113 = 4 is smallest. U = II. 2. 3}. S = {(I, 2), (I. 3»)
3. Update labels
a~
shown in Table 23.7. column (Il).
2. A6 = 136 = I is smallest. U = II. 2. 3. 6J. S = 1(1. 21. (I. 3). (3. 6)J
3. Update labels as shown in Table 23.7, column (III).
2. A4 = 134 = 8 is smallest, U = II, 2, 3, 4, 6J, S = 10,2), (I. 3), (3. 4), (3, 6»)
3. Update labels
a~
shown in Table 23.7, column (IV).
2. A5 = 145 = 6 is smallest. U = V. S
=
(I, 2). (I. 3). (3. 4). (3. 6). (4, 5). Stop.
The tree is the same as in Example I. Sec. 23.4. Its length is 21. You will find it interesting to compare the
•
growth process of the present tree with that in Sec. 23.4.
Table 23.7
Vertex
2
3
4
11-71
Labeling of Vertices in Example 1
Relabeling
Initial
Label
112
113
= 2
=4
(ll)
(I)
1 13
= 4
:x;
124 =
5
:x;
x
6
:x;
x
11
PRIM'S ALGORITHM
Find a sh0l1est spanning tree by Prim's algorithm. Sketch it.
1. For the graph in Prob. I, Sec. 23.4
2. For the graph in Prob. 2. Sec. 23.4
3. For the graph in Prob. 4, Sec. 23.4
4.
6.
(Ill)
(IV)
SEC. 23.6
973
Flows in Networks
7.
8. (Complexity) Show that Prim's algorithm has
9.
10.
11.
12.
complexity 0(n2).
How does Prim's algorithm prevent the formation of
cycles as one grows T?
For a complete graph (or one that is almost complete),
if our dara is an 11 X 11 distance table (as in Prob. 12,
Sec. 23.4). sho'" that the present algorithm [which is
2
0(11 )] cannot easily be replaced by an algorithm of
order less than 0(/1 2 ).
In what case will Prim's algorithm give S = E as the
final result?
TEAM PROJECT. Center of a Graph and Related
Concepts. (a) Distance, eccentricity. Call the length
of a shortest path u ~ v in a graph C = (V. E) the
distance d(u, v) from II to v. For fixed u, call the
greatest £1(11. u) as u ranges over V the ecce1ltricity E(II)
of u. Find the eccentricity of vertices I, 2, 3 in the
graph in Prob. 7.
23.6
(b) Diameter, radius, center. The diameter d(C) of
a graph C = (V, E) is the maximum of li(li. u) as u and
u vary over V. and the radius r(C) is the smallest
eccentricity E(V) of the vertices v. A vertex v with
E(V) = r(C) is called a ce1ltral rertex. The set of all
central vertices is called the center of C. Find d(C),
r(C) and the center of the graph in Prob. 7.
(c) What are the diameter, radius, and center of the
spanning tree in Example I?
(d) Explain how the idea of a center can be used in
setting up an emergency service facility on a
transportation network. In setting up a fire station. a
shopping center. How would you generalize the
concepts in the case of two or more such facilities?
(e) Show that a tree T whose edges all have length I
has center consisting of either one vertex or two
adjacent vel1ices.
<0 Set up an algorithm of complexity 0(11) for finding
the center of a tree T.
13. What would the result be if you applied Prim's
algorithm to a graph that is not connected?
14. CAS PROBLEM. Prim's Algorithm. Write a
program and apply it to Probs. 4--6.
Flows in Networks
After shortest path problems and problems for trees. as a third large area in combinatorial
optimization we discuss flow problems in networks (electrical, water, communication,
traffic, business connections, etc.), turning from graphs to digraphs (directed graphs; see
Sec. 23.1).
By definition, a network is a digraph G = (V, E) in which each edge (i,j) has assigned
> 0 = maximum possible flow along (i, j)], and at one vertex, s,
called the source, a flow is produced that flows along the edges of the digraph G to another
vertex, t, called the target or sink, where the flow disappears.
to it a capacity Cij
r
In applications, this may be the flow of electricity in wires, of water in pipes. of cars
on roads, of people in a public transportation system. of goods from a producer to
consumers, of e-mail from senders to recipients over the Internet, and so on.
We denote the flow along a (directed!) edge (i, j) by fij and impose two conditions:
1. For each edge (i, j) in G the flow does not exceed the capacity
(1)
Cij,
('"Edge condition").
2. For each vertex i, not s or t,
Inflow = Outflow
("Vertex condition," "Kirchhoff's law");
974
CHAP. 23
Graphs. Combinatorial Optimization
in a formula,
0 if vertex i
k
{
j
lnnuw
s. i
=1=
t.
- f at the source s,
=
(2)
=1=
f at the target (sink)
t,
where f is the total flow (and at s the inflow is zero. whereas at t the outflow is zero).
Figure 492 illustrates the notation (for some hypothetical figures).
Fig. 492.
Notation in (2): inflow and outflow for a vertex i (not 5 or t)
Paths
By a path
of edges
VI ~ Vk
from a ve11ex
VI
to a ve11ex
Vk
in a digraph G we mean a sequence
regardless of their directiolls ill G, that forms a path as in a graph (see Sec. 23.2). Hence
when we travel along this path from VI to Vk we may traverse some edge ill its given
direction-then we call it a forward edge of our path-or opposite to its given directionthen we call it a backward edge of our path. In other words. our path consists of oneway streets. and forward edges (backward edges) are those that we travel in the right
direction (in the wrong direction). Figure 493 shows a forward edge (u. v) and a backward
edge (w. v) of a path VI ~ Vk'
CAUTION! Each edge in a network has a given direction, which we COllnot change.
Accordingly, if (u, v) is a forward edge in a path VI ~ Vk, then (u, v) can become a backward
edge only in another path Xl ~ Xj in which it is an edge and is traversed in the opposite
direction as one goes from Xl to.\); see Fig. 494. Keep this in mind. to avoid misunderstandings.
Fig. 493. Forward edge (u. v) and
backward edge (w. v) of a path v, ~
Vk
Fig. 494. Edge (u, v) as forward edge in the path
and as backward edge in the path X, ~ Xj
V, ~ Vk
Flow Augmenting Paths
Our goal will be to maximize thejlow from the sourCe s to the target t of a given network.
We shall do this by developing mcthods for increasing an existing flow (including the
special case in which the latter is zero). The idea then is to find a path P: s ~ t all of
whose edges are not fully used, so that we can push additional flow through P. This
suggests the following concept.
SEC. 23.6
975
Flows in Networks
DEFINITION
Flow Augmenting Path
A flow augmenting path in a network with a given flow
path P: s ~ t such that
(i) no forward edge is used to capacity; thus
(ii) no backward edge has flow 0; thus
EXAMPLE 1
Iij
Iij
<
Iij
Cij
on each edge (i, j) is a
for these;
> 0 for these.
Flow Augmenting Paths
Find flow augmenting paths in the network in Fig. 495, where the first number is the capacity and the second
number a given flow.
Fig. 495. Network in Example 1
First number = Capacity, Second number = Given flow
Solution. In practical problems. networks are large and one needs a sy.•tematic method for augmenting
flows, wllich we discllss ill tile next sectioll. In our small network, which should help to illustrate and clarify
the concepts and ideas, we can find flow augmenting paths by inspection and augment the existing flow f = 9
in Fig. 495. (The outtlow from s is 5 + 4 = 9, which equals the inflow 6 + 3 into t.)
We use the notation
for forward edges
llij =
hj
for backward
II = min ti ij
edge~
taken over all edges of a path.
From Fig. 495 we see that a flow augmenting path PI: s --> t is Pt= 1 - 2 - 3 - 6 (Fig. 496). with
j.I2 = 20 - 5 = 15. etc .. and j. = 3. Hence we can use PI to increase the given flow 9 to f = 9 + 3 = 12.
All three edges of PI are forward edges. We augment the flow by 3. Then the flow in each of the edges of PI
is increased by 3. so that we now have .fI2 = 8 (instead of 5), f23 = 11 (instead of 8), and h6 = 9 (instead
of 6). Edge (2. 3) is now used to capacity. The flow in the other edges remains as before.
We shall now try to increase the flow in thi~ network in Fig. 495 beyond f = 12.
There is another flow augmenting path P 2 : s --> t. namely. P 2 : J - 4 - 5 - 3 - 6 (Fig. 496). [t shows how a
backward edge comes in and how it is handled. Edge (3. 5) is a backward edge. It has now 2, so that tl.35 = 2.
We compute tl.14 = 10 - 4 = 6. etc. (Fig. 496) and .i = 2. Hence we can use P 2 for another augmentation to
get f = 12 + 2 = 14. The new flow is shown in Fig. 497. No further augmentation is possible. We shall confirm
later that f = 14 is maximum.
•
"'23 =
3
)-------i!~3
r
~6"'4
s~
"'35=2
@t
~4~5
""45 =
Fig. 496.
3
Flow augmenting paths in Example 1
CHAP. 23
976
Graphs. Combinatorial Optimization
Cut Sets
A "cut set" is a set of edges in a network. The underlying idea is simple and natural. If
we want to find out what i~ flowing from s to t in a network, we may cut the network
somewhere between sand t (Fig. 497 shows an example) and see what is t10wing in the
edges hit by the cut. because any flow from s to t must sometimes pass through some of
these edges. These form what is called a cut set. [In Fig. 497, the cut set consists of the
edges (2, 3), (5, 2), (4, 5).] We denote this cut set by (S, T). Here S is the set of vertices
on that side of the cut on which s lies (S = {s, 2, 4} for the cur in Fig. 497) and T is the
set of the other vertices (T = {3, 5, t} in Fig. 497). We say that a cut "partitiolls" the
vertex set V into two parts Sand T. Obviously, the corresponding cut set (S, T) consists
of all the edges in the network with one end in S and the other end in T.
Fig. 497.
Maximum flow in Example 1
By definition, the capacity cap (S, T) of a cut set (S, T) is the sum of the capacities of all
forward edges in (S, T) (forward edges only!), that is, the edges that are directed from S to T,
(3)
cap (S, T)
=
[sum over the forward edges of (S, T)].
LCij
Thus, cap (S, T) = II + 7 = 18 in Fig. 497.
The other edges (directedji-ol11 T to S) are called backward edges of the cut set (S, T),
and by the net flow through a cut set we mean the sum of the t10ws in the forward edges
minus the sum of the flows in the backward edges of the cut set.
CAUTION! Distinguish well between forward and backward edges in a cut set and in
a path: (5, 2) in Fig. 497 is a backward edge for the cut shown but a forward edge in the
path 1 - 4 - 5 - 2 - 3 - 6.
For the cut in Fig. 497 the net flow is II + 6 - 3 = 14. For the same cut in Fig. 495 (not
indicated there), the net flow is 8 + 4 - 3 = 9. In both cases it equals the flow f. We claim
that this is not just by chance, but cuts do serve the purpose for which we have introduced them:
THEOREM 1
Net Flow in Cut Sets
AllY gil'e1l flow ill a network G is the net flow through a1lY cut set (S, T) of G.
PROOF
By Kirchhoff's law (2), multiplied by - L at a vertex i we have
(4)
L
fij -
j
L
fh
1
~~
Oulilow
Inflow
=
[0f
if i =I-
05,
if i = s.
t,
SEC. 23.6
977
Flows in Networks
Here we can sum over j and I from 1 to 11 (= number of vertices) by putting fij = 0 for
j = i and also for edges without flow or nonexisting edges; hence we can write the two
sums as one,
if i =I- s, t,
if i
=
s.
We now sum over all i in S. Since s is in S, this sum equals f:
:L :L (fij -
(5)
fji)
=
f·
iES jEV
We claim that in this sum, only the edges belonging to the cut set contlibute. Indeed,
edges with both ends in T cannot contribute, since we sum only over i in S; but edges
(i,j) with both ends in S contribute + fij at one end and - fij at the other, a total contribution
of O. Hence the left side of (5) equals the net flow through the cut set. By (5), this is equal
•
to the flow f and proves the theorem.
This theorem has the following consequence. which we shall also need later in this section.
THEOREM 2
Upper Bound for Flows
A pow
PROOF
f
ill a network G cannot exceed the capacity of any cut set (S, 1) in G.
By Theorem I the flow f equals the net flow through the cut set. f = f 1 - f 2' where f 1
is the sum of the flows through the forward edges and f2 (~ 0) is the sum of the flows
through the backward edges of the cut set. Thus f ~ fl' Now f 1 cannot exceed the sum
of the capacities of the forward edges; but this sum equals the capacity of the cut set, hy
definition. Together, f ~ cap (S, 1), as asserted.
•
Cut sets will now bring out the full importance of augmenting paths:
Main Theorem. Augmenting Path Theorem for Flows
THEOREM 3
A pow from s to t in a network G is maximum (f and only (f there does not exist a
flow augmenting path s ~ t ill G.
PROOF
(a) If there is a flow augmenting path P: s ~ t, we can use it to push through it an
additional flow. Hence the given t10w cannot be maximum.
(b) On the other hand, suppose that there is no flow augmenting path s ~ t in G. Let
So be the set of all vertices i (induding .1') such that there is a flow augmenting path s ~ i,
and let To be the set of the other vertices in G. Consider any edge (i, j) with i in So and
j in To. Then we have a t10w augmenting path s ~ i since i is in So, but s ~ i ~ j is not
t10w augmenting because j is not in So. Hence we must have
forward
(6)
if (i, j) is a [
edge of the path s
backward
~
i ~ j.
978
CHAP. 23
Graphs. Combinatorial Optimization
Otherwise we could use (i, j) to get a flow augmenting path s ---+ ; ---+ j. Now (So, To)
defines a cut set (since I is in To: why?). Since by (6), forward edges are used to capacity
and backward edges carry no flow, the net flow through the cut set (So, To) equals the
sum of the capacities of the forward edges. which is cap (So. To) by definition. This net
t10w equals the given flow f by Theorem 1. Thus f = cap (So, To). We also have
f ~ cap (So, To) by Theorem 2. Hence f must be maximum since we have reached
~w~.
•
The end of this proof yields another basic result (by Ford and Fulkerson, Canadian JOlln1al
of Mathematics 8 (1956), 399-404), namely. the so-called
THEOREM 4
Max-Flow Min-Cut Theorem
The maximum flow ill any network G equals the capacity of a "minimum cut set"'
(= a cut set of minimum capacity) in G.
PROOF
We have just seen that f = cap (So, To) for a maximum flow f and a suitable cut set
(So, To). Now by Theorem :2 we also have f ~ cap (S. T) for this f and any cut set (S, T)
in G. Together, cap (So, To) ~ cap (S, n. Hence (So, To) is a minimum cut set.
The existence of a maximum flow in this theorem follows for rational capacities from
the algorithm in the next section and for arbitrary capacities from the Edmonds-Karp BFS
also in that section.
•
The two basic tools in connection with networks are flow augmeming paths and cut sets.
In the nexl section we show how flow augmenting paths can be used in an algorithm for
maximum flows.
-
_
•... _.... _....lA__.........._____
.-.......
..--
= _ _ .....- . · . . . . .
11-41
FLOW AUGMENTING PATHS
3.
Find flow augmenting paths:
1.
2.
4.
SEC. 23.7
Maximum Flow:
Ford-Fulkerson Algorithm
~! MAXIMUM FLOW
Find the maximum flow by inspection:
S. In Prob. 1.
6. In Prob. 2.
7. In Prob. 3.
8. In Prob. 4.
!9-11! CAPACITY
In Fig. 495 find T and cap (5. T) if 5 equals
9. [1,2.31
10. [I. 2.4.51
11. [1, 3, 51
12. Find a minimum cut set in Fig. 495 and verify that its
capacity equals the maximum flow I = 14.
13. Find examples of flow augmenting paths and the
maximum flow in the network in Fig. 498.
lB~
CAPACITY
In Fig. 498 find T and cap (5. T) if 5 equals
14. [1,2,41
23.7
Maximum Flow:
979
15. [L 2. 4. 6 I
16. [1,2.3.4,51
17. In Fig. 498 find a minimum cut set and its capacity.
Fig. 498.
18. Why are backward edge~ not considered III the
definition of the capacity of a cut set?
19. In which case can an edge U, j) be used as a forward
as well as a backward edge of a path in a network with
a given flow?
20. (Incremental network) Sketch the network in Fig.
498, and on each edge (i,j) write Cij - Iij and Iij' Do
you recognize that from this "incremental network" one
can more easily see flow augmenting paths?
Ford-Fulkerson Algorithm
Flow augmenting paths, as discussed in the last section. are used as the basic tool in the
Ford-Fulkerson4 algorithm in Table 23.8 on the next page in which a given flow (for instance,
zero flow in all edges) is increased until it is maximum. The algOlithm accomplishes the
increa-;e by a stepwise construction of flow augmenting paths. one at a time. until no further
such paths can be constructed, which happens precisely when the tlow is maximum.
In Step I, an initial t10w may be given. In Step 3, a vertex j can be labeled if there is
an edge (i. j) with i labeled and
("forward edge")
or if there is an edge (j, i) with i labeled and
f ·· > 0
• JZ
("backward edge").
To scan a labeled vertex i means to label every unlabeled vertex j adjacent to i that can
be labeled. Before scanning a labeled vertex i, scan all the vertices that got labeled before
i. This BFS (Breadth First Search) strategy was suggested by Edmonds and Karp in
1972 (Journal ofrhe Associariollfor Compllring Machinery 19, 248-64). It has the effect
that one gets shortest possible augmenting paths.
4LESTER RANDOLPH FORD (horn 1927) and DELBERT RAY FULKERSON (1924-1976), American
mathematicians known for their pioneering work on flow algorithms.
980
CHAP. 23
Graphs. Combinatorial Optimization
Table 23.8
Ford-Fulkerson Algorithm for Maximum Flow
Cllllllcllll1l JOl/mlll of Mathematics 9 (1957),210--218
I
ALGORITHM FORD-FULKERSON
[G = (V, E), vertices 1 (= s) . .... 11 (= t). edges (i.j), Cij]
This algorithm computes the maximum flow in a network G with source s. sink t. and
capacities Cij > 0 of the edges (i, j).
INPUT: 11, s = I, t = 11. edges (i, j) of G.
OUTPUT: Maximum flow f in G
1. Assign an initial flow
fij
(for instance,
Cij
fij =
0 for all edges), compute f.
2. Label s by 0. Mark the other veltices "lIlllabeled."
3. Find a labeled vertex i that has not yet been scanned. Scan i as follows. For every
unlabeled adjacent veltexj, if Cij > fij' compute
if i = 1
L11j
and
j,j
=
[
.
mm (Ll·l' Ll 1J.)
and labelj with a ':forward label" (i+, L1j ); or if
fji
if i > I
> 0, compute
and labelj by a "backward label"' (C, Ll).
If no such j exists then OUTPUT f. Stop
[f is the maximum flmr.]
Else continue (that is, go to Step -l).
4. Repeat Step 3 until t is reached.
[This gives a flow allgmellting path P: s ~ t.]
[f it is impossible to reach t then OUTPUT f. Stop
[f is the maximum flow.J
Else continue (that is. go to Step 5).
5. Backtrack the path P, using the labels.
6. Using P, augment the existing flow by !::"t. Set f
7. Remove all labels from veltices 2, ... ,
11.
=
.f + Ll,.
Go to Step 3.
End FORD-FULKERSON
E X AMP L ElFord-Fulkerson Algorithm
Applying the Ford-Fulkerson algorithm, determine the maximum flow for the network in Fig. 499 (which is
the same as that in Example I. Sec. 23.6. ~o that we can compare).
Solu lion. The algorithm proceeds as follows.
1. An initial flow.f = 9 is given.
2. Label
of
(=
I) by 0. Mark 2, 3, 4. 5, 6 "unlabeled."
SEC. 23.7
Maximum Flow:
981
Ford-Fulkerson Algorithm
Fig. 499.
Network in Example 1 with capacities (first numbers) and given flow
3. Scan I.
Compute J. 12 = 20 - 5 = IS = J.2. Label 2 by (1 +, IS).
Compute J. 14 = 10 - 4 = 6 = ,).4' Label 4 by (1 +. 6).
4. Scan 2.
Compute .1.23 = II - 8 = 3. ~3 = min (~2' 3) = 3. Label 3 by (2+. 3).
Compute
~5 =
min (tl. 2• 3) = 3. LabelS by (T, 3).
Scan 3.
Compute .1.36 = 13 - 6 = 7, J. 6 = ~t = min (~3' 7) = 3. Label 6 by (3+, 3).
5. P: I - 2 - 3 - 6
6.
J.t
1
(= t)
is a flow augmenting path.
=
3. Augmentation gives
= 9 + 3 = 12.
h2 = 8. 123 = II. 136
9. other /;j unchanged. Augmented flow
7. Remove labels on vertices 2..... 6. Go 10 Step 3.
3. Scan I.
Compute .112 = 20 - 8 = 12 = .:).2' Label 2 by (I +, 12).
Compute ~14 = 10 - 4 = Ii = tl. 4. Label 4 by 11 +, 6).
4. Scan 2.
Compute J. 5 = min (.1. 2, 3) = 3. LabelS by (2-, 3).
Scan 4. [No I'ertex left forlabelillg.]
Scan 5.
Compute,).3 = min (.15 , 2) = 2. Label 3 by (5-. 1).
Scan 3.
Compute J.36 = 13 - 9 = 4 . .16 = min (.1. 3 , 4) = 2. Label Ii by (3+, 2).
5. P: I - 2 - 5 - 3 - 6
(= t)
6. .1.r. = 2. Augmentalion gives
flow 1 = 12 + 2 = 14.
7. Remove labels on
vertice~
is a flow augmenting path.
h2 = 10./52
=
I. i35 = 0, 136
=
II, other
Ji]
unchanged. Augmented
2, ... , Ii. Go to Step 3.
One can now scan I and then scan 2, as before, but in scanning 4 and then 5 one finds that no vertex is left for
labeling. Thus one can no longer reach 1. Hence the flo" obtained (Fig. 500) is maximum, in agreement with
our result in the last section.
•
Fig. 500.
Maximum flow in Example 1
982
CHAP. 23
Graphs. Combinatorial Optimization
1. Do the computations indicated near the end of Example
1 in detail.
2. Solve Example 1 by Ford-Fulkerson with initial now
O. Is it more work than in Example I?
3. Which are the "bottleneck" edges by which the flow
in Example 1 is actually limited? Hence which
capacities could be decreased without decreasing the
maximum How?
14. If the Ford-Fulkerson algorithm stops without reaching
t. sho~ that the edges with one end labeled and the
other end unlabeled form a cut set (S. T) whose
capacity equals the maximum flow.
15. (Several sources and sinks) If a network has several
sources Sl' . . . , Sk' sho\\ that it can be reduced to the
case of a single-source network by introducing a new
vertex S and connecting S to Slo • . • • Sk by k edges of
capacity Similarly if there are several sinks. lllustrate
tlus idea by a network with two sources and two sinks.
16. Find the maximum flow in the network in Fig. 50 I with
two sources (factories) and two sinks (consumers).
17. Find a minimum cut set in Fig. 499 and its capacity.
18. Show that in a network G with all Cij = I, the maximum
flow equals the number of edge-disjoint paths s ~ t.
19. In Prob. 17, the cut set contains precisely all forward
edges used to capacity by the maximum How
(Fig. 500). Is this just by chance?
20. Show that in a network G with capacities all equal to
I, the capacity of a minimum cut set (S, T) equals the
minimum number q of edges whose deletion destroys
all directed paths S ~ t. (A directed path v ~ w is a
path in which each edge has the direction in which it
is traversed in going from v to w.)
0/:).
14-71
MAXIMUM FLOW
Find the maximum How by Ford-Fulkerson:
4. In Prob. 2, Sec. 23.6.
5. In Prob. I, Sec. 23.6.
6. In Prob. 4, Sec. 23.6.
7. In Prob. 3, Sec. 23.6.
8. What is the (simple) reason that Kirchhoffs law is
preserved in augmenting a flow by the use of a flow
augmenting path?
9. How does Ford-Fulkerson prevent the fOimation of
cycles?
10. How can you see that Ford-Fulkerson follows a BFS
technique?
11. Are the consecutive How augmenting paths produced
by Ford-Fulkerson unique"!
12. (Integer flow theorem) Prove that if the capacities in
a network G are integers. then a maximum How exists
and is an integer.
13. CAS PROBLEM. Ford-Fulkerson. Write a program
and apply it to Probs. 4-7.
23.8
Bipartite Graphs.
Fig. 501.
Problem 16
Assignment Problems
From digraphs we return to graphs and discuss another impOitant class of combinatOlial
optimization problems that arises in assignment problems of workers to jobs, jobs to
machines, goods to storage, ships to piers. classes to classrooms, exams to time periods,
and so on. To explain the problem, we need the following concepts.
A bipartite graph G = (V, E) is a graph in which the veltex set V is partitioned into two
sets 5 and T (without common elements, by the definition of a partition) such that every
edge of G has one end in 5 and the other in T. Hence there are no edges in G that have both
ends in 5 or both ends in T. Such a graph G = (V, E) is also written G = (5, T; E).
Figure 502 shows an illustration. V consists of seven elements, three workers a, b, c,
making up the set 5, and four jobs I, 2. 3, 4, making up the set T. The edges indicate that
worker {/ can do the jobs 1 and 2, worker b the jobs I, 2, 3, and worker c the job 4. The
problem is to assign one job to each worker so that every worker gets one job to do. This
suggests the next concept, as follows.
SEC. 23.8
Bipartite Graphs.
DEFINITION
983
Assignment Problems
Maximum Cardinality Matching
A matching in G = (S. T; E) is a set M of edges of G such that no two of them
have a vertex in common. If M consists of the greatest possible number of edges.
we call it a maximum cardinality matching in G.
For in<;tance, a matching in Fig. 502 is Ml = {(a. 2), (b. l)}. Another is M2 = {(a, 1),
(b, 3), (c, 4)}; obviously, this is of maximum cardinality.
s
T
:~:
c
____
4
Fig. 502. Bipartite graph in the assignment
of a set 5 = {a, b, c} of workers
to a set T = {l, 2. 3. 4} of jobs
A vertex v is exposed (or /lot covered) by a matching M if v is not an endpoint of an
edge of M. This concept. which always refers to some matching, will be of interest when
we begin to augment given matchings (below). If a matching leaves no vertex exposed.
we call it a complete matching. Obviously, a complete matching can exist only if Sand
T consist of the same number of vertices.
We now want to show how one can stepwise increase the cardinality of a matching M
until it becomes maximmn. Central in this task is the concept of an augmenting path.
An alternating path is a path that consists alternately of edges in M and not in M
(Fig. 503A). An augmenting path is an alternating path both of whose endpoints (a and b
in Fig. 503B) are exposed. By dropping from the matching M the edges that are on an
augmenting path P (two edges in Fig. 503B) and adding to M the other edges of P (three
in the figure), we get a new matching, with one more edge than M. This is how we use
an augmenting path in augmenting a given matching by one edge. We assert that this
will always lead, after a number of steps, to a maximum cardinality matching. Indeed, the
basic role of augmenting paths is expressed in the following theorem.
(Al Alternating path
(8) Augmenting path P
Fig. 503. Alternating and augmenting paths.
Heavy edges are those belonging to a matching M.
CHAP. 23
984
THEOREM 1
Graphs. Combinatorial Optimization
Augmenting Path Theorem for Bipartite Matching
A matching M ill a bipartite graph G = (S, T; E) is of mllximum cardillality
only if there does not exist an augmenting path P witlz respect to M.
PROOF
if alld
(a) We show that if such a path P exists. then M is not of maximum cardinality. Let P
have q edges belonging to M. Then P has q + I edges not belonging to M. (In Fig. 503B
we have q = 2.) The endpoints a and b of P are exposed. and all the other vertices on P
are endpoints of edges in M. by the definition of an alternating path. Hence if an edge of
M is not an edge of P. it cannot have an endpoint on P since then M would not be a
matching. Consequently. the edges of M not on P. together with the q + 1 edges of P not
belonging to M form a matching of cardinality one more than the cardinality of M because
we omitted q edges from M and added q + I instead. Hence M cannot be of maximum
cardinality.
(b) We now show that if there is no augmenting path for M, then M is of maximum
cardinality. Let M';' be a maximum cardinality matching and consider the graph H
consisting of all edges that belong either to M or to M*. but not to both. Then it is possible
that two edges of H have a vertex in common. but three edges cannot have a vertex in
common since then two of the three would have to belong to M (or to M*), violating that
M and M-;' are matchings. So every v in V can be in common with two edges of H or with
one or none. Hence we can characterize each "component" (= maximal cOllllected subset)
of H as follows.
(A) A component of H can be a closed path with an even number of edges (in the case
of an odd number, two edges from M or two from M* would meet. violating the matChing
property). See (A) in Fig. 504.
(B) A component of H can be an open path P with the same number of edges from M
and edges from M*, for the following reason. P must be alternating. that is, an edge of
M is followed by an edge of M*. etc. (since M and M':' are matchings). Now if P had an
edge more from M*, then P would be augmenting for M [see (B2) in Fig. 504].
contradicting our assumption that there is no augmenting path for M. If P had an edge
more from M, it would be augmenting for M'~ [see (B3) in Fig. 504]. violating the
maximum cardinality of M*. by part (a) of this proof. Hence in each component of H. the
two matchings have the same number of edges. Adding to this the number of edges that
belong to both M and M* (which we left aside when we made up H), we conclude that
M and M* must have the same number of edges. Since M* is of maximum cardinality,
this shows that the same holds for M, as we wanted to prove.
•
,
(Al
.
......
'"._---_.
-EdgefromM
--.
- - - - Edge from M'"
(81)
. - - - -_ _ . - - - . _ - '---"-'_--..
(82)
. - - - ...__-
(83)
_ _ _ . - - - . ~'---''''''''''--4.
Fig. 504.
__ - - - . -
.--- ..
(Possible)
(Augmenting for M)
(Augmenting for M")
Proof of the augmenting path theorem for bipartite matching
SEC. 23.8
Bipartite Graphs.
Assignment Problems
985
This theorem suggests the algorithm in Table 23.9 for obtaining augmenting paths. in
which vertices are labeled for the purpose of backtracking paths. Such a label is ill additiol1
to the number of the vertex, which is also retained. Clearly, to get an augmenting path.
one must start from an exposed vertex. and then trace an alternating path until one arrives
at another exposed vertex. After Step 3 all vertices in S are labeled. In Step 4. the set T
contains at least one exposed vertex. since otherwise we would have stopped at Step I.
Table 23.9
Bipartite Maximum Cardinality Matching
ALGORITHM MATCHING [G
=
(S, T; E), M, n]
This algorithm determines a maximum cardinality matching M in a bipartite graph G by
augmenting a given matching in G.
INPUT: Bipartite graph G = (S, T; E) with ve11ices I, ...
instance, M = 0)
OUTPUT: Maximum cardinality matching M in G
,11,
marching Min G (for
1. If there is no exposed vertex in S then
OUTPUT M. Stop
[M is of maxilllulIl cardillality ill G.]
Else label all exposed vertices ill S with 0.
2. For each i in S and edge U, j)
l10t
in M, label j with i, unless already labeled.
3. For each 11011exposed j in T. label i with j, where i is the other end
of the unique edge U. j) in M.
-I. Backtrack the alternating paths P ending on an exposed vertex in T
by using the labels on the ve11ices.
5. If no P in Step 4 is augmenting then
OUTPUT M. Stop
[M is of maximum cardinality ill G.]
Else augment M by using an augmenting path P.
Remove all labels.
Go to Step I.
End MATCHING
E X AMP L E 1
Maximum Cardinality Matching
Is the matching Ml in Fig. SOSa of maximum cardinality? If not. augment it until maximum cardinality is reached.
S
T
3
3
(aJ Given graph
and matChing M J
Fig. 505.
5 3
7 2
o
B 3
4
(b) Matching M2
and new labels
Example 1
986
CHAP. 23
Graphs. Combinatorial Optimization
Solution.
We apply the algorithm.
1. Label I and.J. with 0.
2. Label 7 with I. Label 5. 6. !l with 3.
3. Label 2 with 6, and 3 with 7.
[All I'afices are now labeled
(IS
shown in Fig. 474a.]
4. PI: 1 - 7 - 3 - 5. [By backtracking, PI is augmenting.]
P 2: I - 7 - 3 - 8. [P2 is lIugmellfillg.j
5. Augment MI by using Pl' dropping (3,7) from MI and including (I, 7) and (3. 5).
Remove all
label~.
Go to Step I.
Figure 474b shows the resulting matching M2 = {(I. 7). (2, 6). (3. 5)j.
1. Label.J. with 0.
2. Label 7 with 2. Label 6 and !l with 3.
3. Label I with 7. and 2 with 6. and 3 with 5.
4. P 3 : 5 - 3 - 8. [P3 is aitematillg but
/lot
aug11lellfing.]
•
5. Stop. M2 is otl1lllxi11l11111 cardillality (namely, 3).
--..
11-6/
IS-lO I
BIPARTITE OR NOT?
Are the following graphs bipartite? If you answer is yes,
find S and T.
1.~
2.
AUGMENTING PATHS
hnd an augmenting path:
cp------<f
s.:\:
dr---4
~
9.
~
~
3.~
~
10.
7
111-131
MAXIMUM CARDINALITY MATCHING
Augmenting the given matching, find a maximum
cardinality matching:
11. In Prob. 9.
12. In Prob. 8.
7. Can you obtain the answer to Prob. 3 from that to
Prob. I?
13. In Prob. 10.
987
Chapter 23 Review Questions and Problems
14. (Scheduling and matching) Three teachers Xl, X2' -'3
teach four classes )'1, Y2, Y3, Y4 for these numbers of
periods:
Y1
Y2
y
.3
1
0
0
I
1
1
1
1
Xl
X2
-'"3
"
.4
1
Show that this arrangement can be represented by a
bipartite graph G and that a teaching schedule for one
period corresponds to a matching in G. Set up a
teaching schedule with the smallest possible number of
periods.
15. (Vertex coloring and exam scheduling) What is
the smallest number of exam periods for six subjects
a, b, c, d, e, t if some of the students simultaneously
take a, b, t, some c, d, e, some a, c, e, and some c, e?
Solve this as follows. Sketch a graph with six vertices
a, ... , t and join vertices if they represent subjects
simultaneously taken by some students. Color the
vertices so that adjacent vertices receive different
colors. (Use numbers L 2, ... instead of actual colors
if you want) What is the minimum number of colors
you need? For any graph G, this minimum number is
called the (vertex) chromatic number Xv(G). Why is
this the answer to the problem? Write down a possible
schedule.
16. How many colors do you need in vertex coloring the
graph in Prob. 5?
17. Show that all trees can be vertex colored with two
colors.
18. (Harbor management) How many piers does a
harbor master need for accommodating six cruise ships
51, " ' , 56 with expected dates of arrival A and
departure D in July, (A, D) = (10, 13), (13, 15),
(14, 17), (\2, 15), (16, 18), (\4, 17), respectively, if
each pier can accommodate only one ship, arrival being
at 6 a:m and departures at 11 p:m? Hint. Join 5i and 5j
by an edge if their intervals overlap. Then color
vertices.
ship~
51> ... , 55 had to be accommodated?
20. (Complete bipartite graphs) A bipartite graph
G = (5, T: E) is called complete if every vertex in 5
is joined to every vertex in Tby an edge, and is denoted
by K n1 ,,%, where n1 and n2 are the numbers of vertices
in 5 and T, respectively. How many edges does this
graph have?
21. (Planar graph) A planar graph is a graph that can be
drawn on a sheet of paper so that no two edges cross.
Show that the complete graph K4 with four vertices is
planar. The complete graph K5 with five vertices is not
planar. Make this plausible by attempting to draw K5
so that no edges cross. Interpret the result in terms of
a net of roads between five cities.
22. (Bipartite graph K 3 ,3 not planar) Three factories 1,
2, 3 are each supplied underground by water, gas, and
electricity, from points A, E, C, respectively. Show that
this can be represented by K 3 •3 (the complete bipartite
graph G = (5. T; £) with 5 and T consisting of three
vertices each) and that eight of the nine supply lines
(edges) can be laid out without crossing. Make it
plausible that K 3 .3 is not planar by attempting to draw
the ninth line without crossing the others.
23. (Four- (vertex) color theorem) The famous Jour-color
theorem states that one can color the vertices of any
planar graph (so that adjacent vertices get different
colors) with at most four colors. It had been conjectured
for a long time and was eventually proved in 1976
by Appel and Haken [Illinois J. Math 21 (1977),
429-5671. Can you color the complete graph K5 with
four colors? Does the result contradict the four-color
theorem? (For more details, see Ref. [F8] in App. I.)
24. (Edge coloring) The edge chromatic number xeCG) of
a graph G is the minimLUll number of colors needed for
coloring the edges of G so that incident edges get
different colors. Clearly, Xe(G) ;;; max d(u), where d(lI)
is the degree of vertex u. If G = (5, T; E) is bipartite,
the equality sign holds. Prove this for K n .n .
25. Vizing's theorem states that for any graph G (without
multiple edges!). max d(u) ~ Xe(G) ~ max d(u) + I.
Give an example of a graph for which Xe( G) does
exceed max d(u).
19. What would be the answer to Prob. 18 if only the five
.. - :«;::.,w.
:::''''==::== S T ION SAN D PRO B L EMS
1. What is a graph? A digraph? A tree? A cycle? A path?
2. State from memory how you can handle graphs and
digraphs on computers.
3. Describe situations and problems that can be modeled
using graphs or digraphs.
4. What is a shortest path problem? Give applications.
5. What is BFS? DFS? In what connection did these
concepts occur?
6. Give some applications in which spanning trees playa
role.
7. What are bipartite graphs? What applications motivate
this concept?
CHAP. 23
988
Graphs. Combinatorial Optimization
8. What is the traveling salesman problem?
9. What is a network? What optimization problems are
connected with it?
10. Can a forward edge in one path be a backward edge in
another path? [n a cut set? Explain.
11. There is a famous theorem on cut sets. Can you
remember and explain it?
26.
MATRICES FOR GRAPHS OR DIGRAPHS
112-171
}gt482
017 3
Find shortest paths by Dijkstra's algorithm:
2
810
27.
10
6
2
28
Find the adjacency matrix of:
12.
4
3
4
2
3
4
28.
15.
14.
fA
\!dJ
16.
17.~
~
B
~
29. (Shortest spanning tree) Find a shortest spanning tree
for the graph in Prob. 26.
30. Find a shortest
~panning
tree in Prob. 27.
31. Cayley's theorem states that the number of spanning
trees in a complete graph with II vertices is nn-2. Verify
this for f1 = 2. 3. 4.
32. Show that
0(1Il 3)
+
0(111 2) = 0(11/3).
133-341 MAXIMUM FLOW.
Find the maximum flow. where the given numbers are
capacities:
33.
34.
21. Make a vertex incidence list of the digraph in Prob. 13.
22. Make a vertex incidence list of the digraph in Prob. 14.
/23-28/ SHORTEST PATHS
Find a shortest path and its length by Moore's BFS
algorithm. assuming that all the edges have length I:
23.
/
.........
/.:~~>---:\
sVI~/
t
24.
35. Company A has offices in Chicago. Los Angeles. and
New York. Company B in Boston and New York.
Company C in Chicago, Dallas, and Los Angeles.
Represent this by a bipartite graph.
36. (Maximum cal'dinality matching). Augmenting the
given matching. find a maximum cardinality matching:
989
Summary of Chapter 23
. --.-
..... . - _
-.
.-II
.. -..
......
. . . . . . . . . _ _ .-..
Graphs and Combinatorial Optimization
Combinatorial optimization concerns optimIzation problems of a discrete or
combinatorial stmcture. It uses graphs and digraphs (Sec. 23.1) as basic tools.
A graph G = (V. E) consists of a set V of vertices VI, V2, .•.• V n • (often simply
denoted by I. 2 .... , /I) and a set E of edges el' e2' .... em. each of which connects
two ve11ices. We also write (i. j) for an edge with vertices i and j as endpoints. A
digraph (= directed graph) is a graph in which each edge has a direction (indicated
by an arrow). For handling graphs and digraphs in computers. one can use matrices
or lists (Sec. 23.1).
This chapter is devoted to important classes of optimization problems for graphs
and digraphs that all arise from practical applications. and corresponding algorithms,
as follows.
In a shortest path problem (Sec. 23.2) we determine a path of minimum length
(consisting of edges) from a vertex s to a ve11ex t in a graph whose edges (i.j) have
a "'length" lij > O. which may be an actual length or a travel time or cost or an
electrical resistance [if (i, j) is a wire in a net], and so on. Dijkstra's algorithm
(Sec. 23.3) or, when all lij = I, Moore's algorithm (Sec. 23.2) are suitable for
these problems.
A tree is a graph that is connected and has no cycles (no closed paths). Trees are
very important in practice. A ~P(/l111illg tree in a graph G is a tree containing all the
vertices of G. If the edges of G have lengths, we can detenrune a shortest spanning
tree, for which the sum of the lengths of all its edges is minimum, by Kruskal's
algorithm or Prim's algorithm (Sees. 23.4, 23.5).
A network (Sec. 23.6) is a digraph in whieh each edge (i. j) has a capacity
Cij > 0 [= maximum possible flow along (i. j)] and at one ve11ex, the source s. a
flow is produced that flows along the edges to a vertex t, the sink or target, where
the flow disappears. The problem is to maximize the tlow, for instance. by applying
the Ford-Fulkerson algorithm (Sec. 23.7), which uses flow augmenting paths
(Sec. 23.6). Another related concept is that of a cut set, as defined in Sec. 23.6.
A bipartite graph G = (V, E) (Sec. n.8) is a graph whose vertex set V consists
of two parts Sand T such that every edge of G has one end in S and the other in T,
so that there are no edges connecting vertices in S or ve11ices in T. A matching in
G is a set of edges. no two of which have an endpoint in common. The problem
then is to find a maximum cardinality matching in G. that is. a matching M that
has a maximum number of edges. For an algorithm. see Sec. 23.8.
r
PA RT
G
Probability,
Statistics
."
CHAPTER 24
Data Analysis. Probability Theory
CHAPTER 25
Mathematical Statistics
Probability theory (Chap. 24) provides models of probability distributions (theoretical
models of the observable reality involving chance effects) to be tested by statistical
methods, and it will also supply the mathematical foundation of these methods in Chap. 25.
Modern mathematical statistics (Chap. 25) has various engineering applications, for
instance, in testing materials, control of production processes, quality control of production
outputs, perfOlmance tests of systems, robotics. and automatization in general, production
planning, marketing analysis, and so on.
To this we could add a long list of fields of applications, for instance, in agriculture,
biology, computer science, demography, economics, geography, management of natural
resources, medicine, meteorology, politics, psychology, sociology, traffic control, urban
planning, etc. Although these applications are very heterogeneous, we shall see that most
statistical methods are universal in the sense that each of them can be applied in various
fields.
Additional Software for Probability and Statistics
See also the list of software at the beginning of Part E on Numerical Analysis.
DATA DESK. Data Description, Inc., Ithaca, NY. Phone 1-800-573-5121 or
(607) 257-1000, website at www.datadescription.com.
MINITAB. Minitab, Inc., College Park, PA. Phone 1-800-448-3555 or (814) 238-3280,
website at www.minitab.com.
SAS. SAS Institute, Inc., Cary, NC. Phone 1-800-727-0025 or (919) 677-8000, website
at www.sas.com.
991
992
PART G
Probability, Statistics
S-PLUS. Insightful Corporation, Inc., Seattle, W A. Phone 1-800-569-0123 or
(206) 283-8802, website at www.insightful.com.
SPSS. SPSS, Inc., Chicago, [L. Phone 1-800-543-2185 or (312) 651-3000, website at
www.spss.com.
STATISTIC.I\. StatSoft, Inc., Tulsa, OK. Phone (918) 749-1119, website at
www.statsoft.com.
24
CHAPTER
".
Data Analysis.
Probability Theory
We first show how to handle data numelically or in terms of graphs, and how to extract
information (average size. spread of data, etc.) from them. If these data are influenced by
"chance," by factors whose effect we cannot predict exactly (e.g., weather Jata, stock
prices, lifespans of tires, etc.), we have to rely on probability theory. This theory
originated in games of chance, such as flipping coins, rolling dice, or playing cards.
Nowadays it gives mathematical models of chance processes called random experiments
or, briefly, experiments. In such an experiment we observe a random variable X, that
is, a function whose values in a trial (a pelfOimance of an experiment) occur "by chance"
(Sec. 24.3) according to a probability distribution that gives the individual probabilities
with which possible values of X may occur in tlle long run. (Example: Each of the six
faces of a die should occur witll the same probability. l/6.) Or we may simultaneously
observe more than one random variable, for instance. height alld weight of persons or
hardness alld tensile strength of steel. This is discussed in Sec. 24.9, which will abo give
the basis for tlle mathematical justification of the statistical methods in Chap. 25.
Prereqllisite: Calculus.
References and Answers to Problems: App. L Part G, App. 2.
24.1
Data Representation.
Average.
Spread
Data can be represented numelically or graphically in various ways. For instance, your daily
newspaper may contain tables of stock plices and money exchange rates, curves or bar charts
illustrating economical or political developments, or pie charts showing how your tax dollar
is spent. And there are numerous other representations of data for special purposes.
In this section we discuss the use of standard representations of data in statistics. (For
these, software packages, such as DATA DESK and MINITAB, are available, and Maple
or Mathematica may also be helpful; see pp. 778 and 991) We explain corresponding
concepts and methods in terms of typical examples, beginning with
(1)
89
84
87
81
89
86
91
90
78
89
87
99
83
89.
These are 11 = 14 measurements of the tensile strength of sheet steel in kg/mm2, recorded
in the order obtained and rounded to integer values. To see what is going on, we sort
these data, that is, we order them by size,
(2)
78
81
83
84
86
87
87
89
89
89
89
90
91
99.
S0l1ing is a standard process on the computer; see Ref. [E25], listed in App. 1.
993
994
CHAP. 24
Data Analysis. Probability Theory
Graphic Representation of Data
We shall now discuss standard graphic representations used in statistics for obtaining
information on properties of data.
Stem-and-Leaf Plot
This is one of the simplesl but most useful representations of data. For (I) it is shown in
Fig. 506. The numbers in (1) range from 78 to 99; see (2). We divide these numbers into
5 groups, 75-79, 80-84, 85-89, 90-94, 95-99. The integers in the tens position of the
groups are 7,8,8,9,9. These form the stem in Fig. 506. The first lealis 8 (representing
78). The second leaf is 134 (representing 81, 83, 84), and so on.
The number of times a value occurs is called its absolute frequency. Thus 78 has
absolute frequency 1, the value 89 has absolute frequency 4, etc. The column to the extreme
left in Fig. 506 shows the cumulative absolute frequencies, that is, the sum of the absolute
frequencies of the values up to the line of the leaf. Thus, the number 4 in the second line
on the left shows that (1) has 4 values up to and including 84. The number 11 in the next
line shows that there are II values not exceeding 89, etc. Dividing the cumulative absolute
frequencies by 11 (= 14 in Fig. 506) gives the cumulative relative frequencies.
Histogram
For large sets of data, histograms are better in displaying the distribution of data than
stem-and-leaf plots. The principle is explained in Fig. 507. (An application to a larger
data set is shown in Sec. 25.7). The bases of the rectangles in Fig. 507 are the x-intervals
(known as class intervals) 74.5-79.5, 79.5-84.5, 84.5-89.5, 89.5-94.5, 94.5-99.5, whose
midpoints (known as class marks) are x = 77, 82, 87. 92, 97, respectively. The height
of a rectangle with class mark x is the relative class frequency frel(X), defined as the
number of data values in that class interval, divided by 1l (= 14 in our case). Hence the
areas of the rectangles are proportional to these relative frequencies, so that histograms
give a good impression of the distribution of data.
Center and Spread of Data: Median, Quartiles
As a center of the location of data values we can simply take the median, the data value
that falls in the middle when the values are ordered. In (2) we have 14 values. The seventh
of them is 87, the eighth is 89, and we split the difference, obtaining the median 88. (In
general, we would get a fraction.)
The spread (vmiabil ity) of the data values can be measured by the range R = Xmax - xmin'
the largesl minus the smallest data values, R = 99 - 78 = 21 in (2).
0.5
Leaf unit = 1.0
1
4
11
13
14
7
8
8
9
9
8
134
6779999
01
9
Fig. 506. Stem-and-Ieaf plot
of the data in (I) and (2)
0.4
0.3
0.2
0.1
o
x
Fig. 507. Histogram of the data in
(I) and (2) (grouped as in Fig. S06)
SEC. 24.1
Data Representation.
Average.
995
Spread
Better information gives the interquartile range IQR = qu - qv Here the upper
quartile qu is the middle value among the data values above the median. The lower
quartile qL is the middle value among the data values below the median. Thus in (2) we
have qu = 89 (the fourth value from the end), qL = 84 (the fOlllth value from the
beginning). and IQR = 89 - 84 = 5. The median is also called the middle quartile and
is denoted by qM. The rule of "splitting the difference" (just applied to the middle quartile)
is equally well used for the other quartiles if necessary.
Boxplot
The boxplot of (I) in Fig. 508 is obtained from the five numbers xmin, qv qM, qu, Xmax
just determined. The box extends from qL to quo Hence it has the height IQR. The position
of the median in the box shows that the data distribution is not symmetric. The two lines
extend from the box to Xmin below and to Xmax above. Hence they mark the range R.
Boxplots are particularly suitable for making comparisons. For example, Fig. 508 shows
boxplots of the data sets (I) and
(3)
91
89
93
91
87
94
92
85
91
90
96
93
89
91
92
93
93
94
96
(consisting of 11 = 13 values). Ordeling gives
(4)
85
87
89
89
90
91
91
(tensile strength, as before). From the plot we immediately see that the box of (3) is shorter
than the box of (I) (indicating the higher quality of the steel sheets!) and that qM is located
in the middle of the box (showing the more symmetric form of the distribution). Finally,
'"max is closer to qu for (3) than it is for (l), a fact that we shall discuss later.
For plotting the box of (3) we took from (4) the values xmin = 85, qL = 89, qM = 91,
qu = 93, Xmax = 96.
Outliers
An outlier is a value that appears to be uniquely different from the rest of the data set. It
might indicate that something went wrong with the data collection process. In connection
with qumtiles an outlier is conventionally defined as a value more than a distance of 1.5
IQR from either end of the box.
100
95
gqU
qM
90
qu
qM
85
I
qL
qL
80
75
Data set 0)
Fig. 508.
Data set (3)
Boxplots of data sets (1) and (3)
CHAP. 24
996
Data Analysis. Probability Theory
For the data in (1) we have lQR = 5, qL = 84, qu = 89. Hence outliers are smaller
than 84 - 7.5 or larger than 89 + 7.5, so that 99 is an outlier [see (2)]. The data (3) have
no outliers, as you can readily verify.
Mean.
Standard Deviation.
Variance
Medians and quartiles are easily obtained by ordering and counting. practically without
calculation. But they do not give full information on data: you can change data values to
some extent without changing the median. Similarly for the qUaItiles.
The average size of the data values can be measured in a more refined way by tlle mean
1
.t = -
(5)
n
2: Xj
11 j=l
I
= -
(xl
+
X2
+ ... +
X,.).
11
This is the aritllmetic mean of the data values, obtained by taking their sum and dividing
by the data si::e 11. Thus in (I).
x=
l~ (89
+ 84 + ... + 89) =
6~1 = 87.3.
Every data value contributes, and changing one of them will change the mean.
Similarly, the spread (variability) of tlle data values can be measured in a more refined
way by the standard deviation s or by its square, the variance
(6)
Thus, to obtain the variance of the data, take the difference .\1 - .r of each data value fi·om
the mean, square it, take the sum of these 1l squares, and divide it by /I - 1 (not 11, as we
motivate in Sec. 25.2). To get the standard deviation s, take the square root of S2.
For example, using.t = 61117, we get for the data (I) tlle variance
S2
=
113
[(89 - 6~1)2 + (84 -
6i l )2 + ... + (89
-
6i l )21
=
1~6 =
25.14.
Hence the standard deviation is s = V176/7 = 5.014. Note that the standard deviation
has the same dimension as the data values (kg/mm2, see at the beginning), which is an
advantage. On the other hand, the variance is preferable to the standard deviation in
developing statistical metl1Ods. as we shall see in Chap. 25.
CAUTION! Your CAS (Maple, for instance) may use lin instead of 1/(n - 1) in (6),
but the latter is better when 11 is small (see Sec. 25.2).
--.... -_...........--- --....--................
.•.
'1_1
~
°1
DATA REPRESENTATIONS
Represent the data by a stem-and-leaf plot, a histogram, and
a boxplot:
1. 20
2. 7
21
6
20
4
0
19
7
20
1
2
19
21
4
6
19
6
3.56 58 54 33
38 38 49 39
4. 12.1 10 12.4
14.7 9.9
5. 70.6 70.9 69.1
71.1 68.9 70.3
41
10.5
30
44
9.2
37
17.2
51
46
56
11.4
11.8
71.3 70.5 69.7 7l.5
69.2 71.2 70.4 72.8
69.8
SEC. 24.2
997
Experiments, Outcomes, Events
6. -0.52 0.11 -0.48 0.94 0.24 -0.19 -0.55
7. Reaction time [sec] of an automatic switch
2.3 2.2 2.4 1.5 2.3 2.3 2.4 2.1 2.5 2.4
2.6 1.3 2.5 2.1 2.4 2.2 2.3 2.5 2.4 2.4
8. Carbon content ['k J of coal
89
87
90
86
89
82
84
85
!i8
89
80
76
90
87
89
86
88
86
90
!i5
9. Weight offilled bottles [g] in an automatic tilling process
403
399
398
401
400
401
401
10. Gasoline consumption [gallons per mile] of six cars of
the same model
14.0
111-161
14.5
13.5
14.0
14.5
14.0
AVERAGE AND SPREAD
Find the mean and compare it with the median. Find the
standard deviation and compare it with the interquartile range.
24.2
11. The data in Prob. I.
12.
13.
14.
15.
The data in
The data in
The data in
The data in
16. 5
22
Prob.
Prob.
Prob.
Prob.
7 23
2.
5.
6.
9.
6. Why is
Ix - qMI
so large?
17. Construct the simplest possible data with.t = 100 but
qM = O.
18. (Mean) Prove that t must always lie between the
smallest and the largest data values.
19. (Outlier, reduced data) Calculate s for the data
4 I 3 10 2. Then reduce the data by deleting
the outlier and calculate s. Comment.
20. WRITING PROJECT. Average and Spread.
Compare QM' IQR and .t, s, illustrating the advantages
and disadvantages with examples of your own.
Experiments, Outcomes, Events
We now turn to probability theory. This theory has the purpose of providing mathematical
models of situations affected or even governed by "chance effects," for instance, in weather
foreca~ting, life insurance, quality of technical products (computers. batteries, steel sheets,
etc.). traffic problems, and, of course, games of chance with cards or dice. And the accuracy
of these models can be tested by suitable observations or experiments-this is a main
purpose of statistics to be explained in Chap. 25.
We begin by defining some standard terms. An experiment is a process of measurement
or observation, in a laboratory, in a factDlY, on the street, in nature, or wherever; so
"experiment" is used in a rather general sense. Our interest is in expeliments that involve
randomness, chance effects, so that we cannot predict a result exactly. A trial is a single
performance of an experiment. Its result is called an outcome or a sample point. n trial"
then give a sample of size 11 consisting of n sample points. The sample space S of an
experiment is the set of all pussible outcumes.
E X AMP L E S 1 - 6
Random Experiments. Sample Spaces
(1)
In'peeting a IightbuIb. S = {Defective. Nondefeetive}.
(2)
RoIling a die. S = {I. 2. 3.4,5. Ii}.
(3)
Measuring tensile strength of wire. S the numbers in some interval.
(4)
Measuring coppel' content of brass. S: 50% to 909<, say.
(5)
Counting daily traffic acciden(, in New York. S the integers in some interval.
(6) Asking for opinion about a new car model. S = (Like. Dislike. Undecided).
•
The subsets of S are called events and the outcomes simple events.
E X AMP L E 7
Events
In (2). evcms are A = {l. 3, 5}
events are {I}. {2} .... , {6}.
("Odd /lllmbe,.··), B =
{2. 4. 6}
("El'ell Illlmbe,."). C =
{5. 6}, etc. Simple
998
CHAP. 24
Data Analysis. Probability Theory
If in a trial an outcome a happens and a E A (a is an element of A), we say that A happens.
For instance, if a die turns up a 3, the event A: Odd number happens. Similarly, if C in
Example I happens (meaning 5 or 6 turns up), then D = {4, 5, 6} happens. Also note that
S happens in each triaL meaning that some event of S always happens. All this is quite natural.
Unions, Intersections,
Complements of Events
In connection with basic probability laws we shall need the following concepts and facts
about events (subsets) A, B, C, ... of a given sample space S.
The union A U B of A and B consists of all points in A or B or both.
The intersection A
n
B of A and B consists of all points that are in both A and B.
If A and B have no points in common. we write
AnB=0
where 0 is the empty set (set with no elements) and we call A and B mutually exclusive
(or disjoint) because in a trial the occurrence of A excludes thal of B (and conversely)if your die turns up an odd number, it cannot turn up an even number in the same trial.
Similarly, a coin cannot turn up Head and Tail at the same time.
Complement A C of A. This is the set of all the points of S Ilot in A. Thus,
An A C
= 0,
AU N = S.
In Example 7 we have A C = B, hence A U A C = {l, 2, 3, 4, 5, 6} = S.
Another notation for the complement of A is A (instead of A C ), but we shall not use this
because in set theory A is used to denote the closure of A (not needed in our work).
Unions and intersections of more events are defined similarly. The union
of events AI' ... , Am consists of all points that are in at least one Aj . Similarly for the
union A I U A2 U ... of infinitely many subsets A b A 2, ... of an ififinite sample space
S (that is, S consists of infinitely many points). The intersection
of AI> ... , Am consists of the points of S thar are in each of these events. Similarly for
the intersection Al n A2 n ... of infinitely many subsets of S.
Working with events can be illustrated and facilitated by Venn diagrams I for showing
unions, intersections. and complements, as in Figs. 509 and 510, which are typical
examples thal give the idea.
E X AMP L E 8
Unions and Intersections of 3 Events
In rolling a die, consider the events
A:
Number greater thall 3,
B:
Number less thall 6.
c:
Evell /lumber.
Then A n B = (4, 5), B n c = (2. 4). C n A = (4, 6), An B n c = {4). Can you sketch a Venn diagram
of this? Furthermore. A U B = S. hence A U B U C = S (why?).
•
I JOHN
VENN (1834-1923), English mathematician.
SEC. 24.2
999
Experiments, Outcomes, Events
s
s
Intersection A n B
UmonAuB
Fig. 509. Venn diagrams showing two events A and B in a sample space 5
and their union A U B (colored) and intersection A n B (colored)
Fig. 510.
A
--
---- ........
-- ....
.. ...........
11-91
_.......
=
Venn diagram for the experiment of rolling a die, showing 5,
{1, 3, 5}, C = {5, 6}, A U C = {l, 3, 5, 6}, A n C = {5}
...... ----
.-.
~.-----
SAMPLE SPACES, EVENTS
Graph a sample space for the expeliment:
1. Tossing 2 coins
2. Drawing 4 screws from a lot of right-handed and
left-handed screws
3. Rolling 2 dice
VENN DIAGRAMS
115-201
15. In connection with a trip to Europe by some students,
consider the events P that they see Paris, G that they
have a good time. and M that they run out of money.
and describe in words the events 1, .. " 7 in the
diagram.
G
4. Tossing a coin until the fIrst Head appears
5. Rolling a die until the first "Si,," appears
6. Drawing bolts from a lot of 20, containing one
defective D, until D is drawn. one at a time tmd
assuming sampling without replacement. that is,
bolts drawn are not returned to the lot
7. Recording the lifetime of each of 3 lightbulbs
8. Choosing a committee of 3 from a group of 5 people
Problem 15
9. Recording the daily maximum tempemture X and the
maximum air pressure Y at some point in a city
16. Using Venn diagrams, graph and check the mles
10. In Prob. 3, circle and mark the events A: Equal faces,
B: Sum exceeds 9, C: SUIll equals 7.
11. In rolling 2 dice, are the events A: SIIII1 divisible by 3
and B: Sum dil'isible by 5 mutually exclusive?
12. Answer the question in Prob. 11 for rolling 3 dice.
13. In Prob. 5 list the outcomes that make up the event E:
First "Six" in rolling at most 3 times. Describe E C •
14. List all 8 subsets of the sample space S = (a, b, c}.
A U (B
A
n
n
C) = (A U B)
(B U C) = (A
n
n
(A U C)
B) U (A
n
C)
17. (De Morgan's laws) Using Venn diagrams. graph and
check De Morgan's laws
(A U B)c = A C
(A
n
n
BC
B)c = A C U B C •
CHAP. 24
1000
Data Analysis. Probability Theory
(A C)C
18. Using a Venn diagram. show that A <;;; B if and only if
An B
=
A.
SC
A UN = S,
19. Show that, by the definition of complement, for any
subset A of a sample space S,
24.3
= A,
= 0,
0 c = S,
AnN = 0.
20. Using a Venn diagram, show that A <;;; B if and only if
AU B = B.
Probability
The "probability" of an event A in an experiment is supposed to measure how frequently
A is about to occur if we make many trials. If we flip a coin, then heads H and tails T
will appear about equally often-we say that Hand T are "equally likely." Similarly, for
a regularly shaped die of homogeneous material ("fair die") each of the six outcomes
1, ... , 6 will be equally likely. These are examples of experiments in which the sample
space S consists of finitely many outcomes (points) that for reasons of some symmetry
can be regarded as equally likely. This suggests the following definition.
First Definition of Probability
DEFINITION 1
If the sample space S of an experiment consists of finitely many outcomes (points)
that are equally likely, then the probability peA) of an event A is
(1)
peA) =
Number of points in A
Number of points in S
From this definition it follows immediately that. in particular,
(2)
E X AMP L E 1
peS)
=
1.
Fair Die
In rolling a fair die once. what is the probability peA) of A of obtaining a 5 or a 6? The probabilIty of B: "El'en
11l1llzber"?
Solutioll.
The six outcomes are equally likely. so that each has probability 1/6. Thus peA) = 2/6 = 1/3
•
because A = (5, 6J has 2 points, and PCB) = 3/6 = 112.
Definition 1 takes care of many games as well as some practical applications, as we shall
see, but celtainly not of all experiments, simply because in many problems we do not
have finitely many equally likely outcomes. To arrive at a more general definition of
probability, we regard probability as the coullterpart of relative frequellcy. Recall from
Sec. 24.1 that the absolute frequency f(A) of an event A in n trials is the number of times
A occurs, and the relative frequency of A in these trials is f(A)ln; thus
(3)
Number of times A occurs
Number of trials
SEC 24.3
1001
Probability
Now if A did not occur, then f(A) = O. If A always occurred. then f(A) =
the extreme cases. Division by 11 gives
11.
These are
(4*)
In particular, for A = S we have f(S) = 11 because S always occurs (meaning that some
event always occurs; if necessary, see Sec. 24.2, after Example 7). Division by 11 gives
(5*)
fre1(S) = I.
Finally. if A and B are mutually exclusive, they cannot occur together. Hence the absolute
frequency of their union A U B must equal the sum of the absolute frequencies of A and
B. Division by 11 gives the same relation for the relative frequencies,
(6*)
f re1(A U B) = f rei (A)
+
n
(A
fre1(B)
B
= 0).
We are now ready to extend the definition of probability to experiments in which equally
likely outcomes are not available. Of course, the extended definition should include
Definition 1. Since probabilities are supposed to be the theoretical cOllnterpmt of relative
frequencies, we choose the properties in (4*), (5*), (6"") as axioms. (Historically, such a
choice is the result of a long process of gaining experience on what might be best and
most practical.)
DEFINITION 2
General Definition of Probability
Given a sample space S, with each event A of S (subset of S) there is associated a
number P(A), called the probability of A. such that the following axioms of
probability are satisfied.
1. For every A in S,
(4)
0:0;:
peA)
:0;:
I.
2. The entire sample space S has the probability
(5)
peS) = 1.
3. For mutually exclusive events A and B (A
(6)
peA U B)
n
= peA) +
B = 0; see Sec. 24.2),
PCB)
(A
n
B = 0).
If S is infinite (has infinitely many points). Axiom 3 has to be replaced by
3'. For mutually exclusive events AI> A 2 • • . • ,
(6')
In the infinite case the subsets of S on which peA) is defined are restricted to form a
so-called u-algebra. as explained in Ref. (GR6j (not (G6]!) in App. 1. This is of no
practical consequence to us.
CHAP. 24
1002
Data Analysis. Probability Theory
Basic Theorems of Probability
We shall see that the axioms of probability will enable us to build up probability theory
and its application to statistics. We begin with three basic theorems. The first of them is
useful if we can get the probability of the complement A C more easily than peA) itself.
THEOREM 1
Complementation Rule
For an event A and its complemellt A C in a sample space S,
(7)
PROOF
peN)
I - peA).
By the definition of complement (Sec. 24.2). we have S
Hence by Axioms 2 and 3,
I
EXAMPLE 2
=
=
peS)
=
peA)
+ P(A
C
).
=
A U A C and A
n
AC
0.
•
thus
Coin Tossing
Five coin~ are tossed simultaneously. Find the probability of the event A: At least one head tums up. Assume
that the coins are fair.
Solution. Since each coin can [Urn up heads or mils, the sample space consists of 25 = 32 Olilcomes. Since
the coins are fair. we may assign the same probability (1/32) to each outcome. Then the event A C (No heads
c
c
tum LIp) consists of only 1 outcome. Hence P(A ) = 1/32, and the answer is peA) = 1 - P(A ) = 31/32. •
The next theorem is a simple extension of Axiom 3, which you can readily prove by
induction.
THEOREM 2
Addition Rule for Mutually Exclusive Events
For mutually exclusive events AI, ... , Am in a sample space S,
E X AMP L E 3
Mutually Exclusive Events
If the probability that on any workday a garage will get 10-20,21-30,31-40, over 40 cars to service is 0.20,
0.35, 0.25, 0.12, respectively, what is the probability that on a given workday the garage gets at least 21 cars
to service?
Solution. Since these are mutually exclusive events, Theorem 2 gives the answer 0.35 + 0.25 + 0.12 = 0.72.
Check this by the complementation rule.
•
In many cases, events will not be mutually exclusive. Then we have
THEOREM 3
Addition Rule for Arbitrary Events
For events A and B in
(9)
1I
sample space,
peA U B)
=
peA)
+ PCB)
- peA
n
B).
SEC. 24.3
1003
Probability
PROOF
C, D. E in Fig. 511 make up A U B and are mutually exclusive (disjoint). Hence by
Theorem 2.
peA U B)
= P( C) + P(D) +
This gives (9) because on the right P(C)
and peE) = PCB) - PW) = PCB) - peA
+ P(D) = peA) by Axiom 3 and disjointness;
n B). also by Axiom 3 and disjointness. •
B
A
Fig. 511.
peE).
Proof of Theorem 3
Note that for mutually exclusive events A and B we have A
by comparing (9) and (6),
(10)
P(0)
(Can you also prove this by (5) and
E X AMP L E 4
n
B
= 0 by definition and.
= O.
om
Union of Arbitrary Events
In tossing a fair die. what is the probability of getting an odd number or a number less than 4?
Solutioll.
Let A be the event "Odd number" and B the event "Numberless than 4." Then Theorem 3 gives
the answer
P(AUB)=~+~-~=~
because A
n
•
B = .. Odd lIl//uber less thall 4" = {I, 3}.
Conditional Probability.
Independent Events
Often it is required to find the probability of an event B under the condition that an event
A occurs. This probability is called the conditional probability of B given A and is denoted
by p(BIA). In this case A serves as a new (reduced) sample space, and that probability is
the fraction of peA) which corresponds to A n B. Thus
(11)
p(BIA)
=
PeA n B)
peA)
[peA)
*- 01.
[PCB)
*-
Similarly, the cunditiO/wl probability of A gil'en B is
p(AIB)
(12)
PCB)
Solving (II) and (12) for peA
THEOREM 4
= peA n B)
n
B), we obtain
Multiplication Rule
If A and B lire events in a sample space Sand peA) *- 0, PCB) *- O. then
(13)
peA
n
B)
=
P(A)P(BIA)
=
P(B)P(AIB).
0].
1004
E X AMP L E 5
CHAP. 24
Data Analysis. Probability Theory
Multiplication Rule
In producing screw~.letA mean "screw too slim" and B "screw too short." Let peA) = 0.1 and let the conditional
probability that a slim sere", is also too short be p(BIA) = n.2. What is the probability that a scre'" that we pick
randoml) from the lot produced will be both too slim and too short?
Solution.
PIA
n
B) = p(A)p(BIA) = 0.1 • 0.2 = 0.02 = 2'7c. by Theorem 4.
Independent Events.
•
If events A and B are such that
(14)
P(A
n
B) = P(A)P(B),
they are called independent events. Assuming P(A)
that in this case
p(AIB)
=
peA),
* O. P(B) * 0, we see from (II )-( 13)
P(BIA)
= PCB).
This means that the probability of A does not depend on the occurrence or nonoccurrence
of B, and conversely. This justifies the term "independent."
Independence of III Events.
Similarly,
111
events A I,
••• ,
Am are called independent if
(ISa)
as well as for every k different events A jl , Ah , ... , Aj ".
(ISb)
where k = 2. 3, ... ,
111 -
l.
Accordingly. three events A. B. C are independent if and only if
peA
n
B) = P(A)P(B),
n 0 = P(B)P(0,
P(C n A) = P(C)P(A).
PCB
(16)
peA
n
B
n0
= P(A)P(B)P(O.
Sampling. Our next example has to do with randomly drawing objects, one at a time,
from a given set of objects. This is called sampling from a popUlation, and there are
two ways of sampling, as follows.
1. In sampling with replacement, the object that was drawn at random is placed back
to the given set and the set is mixed thoroughly. Then we draw the next object at
random.
2. In sampling without replacement the object that was drawn is put aside.
E X AMP L E 6
Sampling With and Without Replacement
A box contains 10 screws. three of which are defective. Two screws are drawn at random. Find the probability
that none of the lWO screws is defective.
Solution.
We consider the event~
A: First drawn screll' 1100ldefectil'e.
B: Second drawl1 screw /101ulefectil'e.
SEC. 24.3
1005
Probability
Clearly. PIA) = -k becau,e 7 of the 10 screw, are nomlefective and we sample at ramI om. so thm each screw
has the same probability (to) of being picked. If we sample with replacement. the situation before the second
drawing is the same as at the beginning. and PCB) = -k. The events are independent. and the answer is
PeA
n
B) = P(A)P(B) = 0.7' 0.7 = 0.49 = 49<)}.
If we sample without replacement. then PIA) =
in the box, 3 of which are defective. Thus P(B/A)
PIA
n
-k, a~ before. If A has occurred. then there are 9 ~crews left
= ~ = ~. and 1l1eorem 4 yields the answer
B) =
-k . ~ =
47<)}.
•
Is it intuitively clear that this value mu,t be smaller than the preceding one'!
--.•.. .... - •••
"'A
1. Three screws are drawn at random from a lot of 100
screws. 10 of which are defective. Find the probability
that the screws drawn will be nondefective in drawing
(a) with replacement. (b) without replacement.
11. In roIling two fair dice. what is the probability of
2. In Prob. I find the probability of E: At least I defective
(i) directly. (ii) by using complements; in both cases
(a) and (b).
13. A motor drives an eiectIic generator. During a 30-day
period. the motor needs repair with probability 8%- and
the generator needs repair with probability 49'r. What
is the probability that during a given period. the entire
apparatus (consisting of a motor and a generator) will
need repair?
3. If we inspect paper by drawing 5 sheets without
replacement from every batch of 500. what is the
probability of getting 5 clean sheets although 2% of
the sheets contain spots') First guess.
4. Under what conditions will it make practically no
difference whether we sample with or witholll
replacement? Give numeric examples.
5. If you need a right -handed screw from a box containing
20 right-handed and 5 left-handed screws. what is the
probability that you get at least one right-handed screw
in drawing 2 screws with replacement?
6. If in Prob. 5 you draw without replacement. does the
probability decrease or increase? First think, then
calculate.
7. What gives the greater probability of hitting some target
at least once: (a) hitting in a shot with probability 112
and firing I shot. or (b) hitting in a shot with probability
1I4 and firing 2 shots? First guess. Then calculate.
8. Suppose that we draw cards repeatedly and with
replacement from a file of 100 cards. 50 of which refer
to male and 50 to female persons. What is the
probability of obtaining the second "female" card
before the third "male" card?
9. What is the complementary event of the event
considered in Prob. 8'1 Calculate its probability and use
it to check your result in Prob. 8.
10. In rolling two fair dice. what is the probability of
obtaining a sum greater than 4 but not exceeding 7'1
obtaining equal numbers or numbers with an even
product?
12. Solve Prob. II by considering complements.
14. If a circuit contains 3 automatic switches and we want
that. with a probability of 959'c. during a given time
interval they are all working. what probability of failure
per time interval can we admit for a single switch?
15. If a certain kind of tire has a life exceeding 25 000 miles
with probability 0.95. what is the probability that a set of
4 of these tires on a car will last longer than 25000 miles?
16. In Prob. 15. what is the probability that at least one of
the tires will not last for 25 000 miles?
17. A pressure control apparatus contains 4 valves. The
apparatus will not work unless all valves are operative.
If the probability of failure of each valve during some
interval of time is 0.03. what is the corresponding
probability of failure of the apparatus?
18. Show that if B is a subset of A, then P(B)
19. Extending Theorem 4. show that
PeA n B n C) = P(A)P(BIA)P(CiA
n
~
peA).
B).
20. You may wonder whether in (16) the last relation
follows from the others. but the answer is no. To see
this. imagine that a chip is drawn from a box containing
4 chips numbered 000,01 I. 101, 110. and let A. B. C
be the events that the first. second. and third digit.
respectively. on the drawn chip is 1. Show that then
the first three formulas in (16) hold but the last one
does not hold.
CHAP. 24
1006
Data Analysis. Probability Theory
24.4 Permutations and Combinations
Permutations and combinations help in finding probabilities peA) = alk by systematically
counting the number a of pointf> of which an event A consists; here, k is the number of
points of the sample space S. The practical difficulty is that a may often be surprisingly
large, so that actual counting becomes hopeless. For example, if in assembling some
instrument you need 10 different screws in a certain order and you want to draw them
randomly from a box (which contains nothing else) the probability of obtaining them in
the required order is only 1/3 628800 because there are
10!
=
I . 2' 3 ·4·5·6·7' 8 . 9 . 10
= 3628800
orders in which they can be drawn. Similarly, in many other situations the numbers of
orders, anangements, etc. are often incredibly large. (If you are unimpressed. take 20
screws-how much bigger will the number be?)
Permutations
A permutation of given things (elements or objects) is an arrangement of these things in
a row in some order. For example, for three letters a, b, c there are 3! = I . 2 . 3 = 6
permutations: abc, acb, bac, bca, cab, cba. This illustrates (a) in the following theorem.
THEOREM 1
Permutations
(a) Different t/zings. The lUt1llber of permutations of n different things taken
all at a time is
(1)
n! = I . 2 . 3 ...
l
(read "n.factorial").
Il
(b) Classes of equal things. If n given things can be divided into c classes of
alike things differing from class to class, then the number of permutations of
these things taken all at a time is
(2)
n!
(Ill
+
n2
+ ... +
llc
=
11)
where nj is the 1lumber of things ill the jth class.
PROOF
(a) There are 11 choices for filling the first place in the row. Then n - I things are still
available for filling the second place, etc.
(b) 171 alike things in class 1 make n 1 ! permutations collapse into a single permutation
(those in which class I things occupy the same 111 positions), etc., so that (2) follows
•
from (I).
SEC. 24.4
1007
Permutations and Combinations
E X AMP L E 1
Illustration of Theorem l(b)
If a box contains 6 red and 4 blue balls. the probability of drawing first the red and then the blue balls is
p
~
•
6!4!110! = 11210 = 0.5%.
A permutation of n things taken k at a time is a permutation containing only k of the
n given things. Two such permutations consisting of the ~ame k elements. in a different
order, are different, by definition. For example, there are 6 different permutations of the
three letters a, b, e, taken two letters at a time, ab, ae, be, ba, ea, eb.
A permutation of Il things taken k at a time with repetitions is an arrangement
obtained by putting any given thing in the first position, any given thing, including a
repetition of the one just used, in the second, and continuing until k positions are filled.
For example, there are 32 = 9 different such permutations of a, b, e taken 2 letters at a
time, namely, the preceding 6 permutations and aa, bb, ec. You may prove (see Team
Project 18):
THEOREM 2
Permutations
TIle number of different pennutations of n different things taken k at a time without
repetitions is
(3a)
n(n -
1)(n - 2) ... (n - k
+
1)
=
n!
(n - k)!
alld with repetitions is
(3b)
E X AMP L E 2
Illustration of Theorem 2
In a coded telegram the letters are arranged in groups of five letters, called words. From (3b) we see that the
number of different such words is
265 = II 8!B 376.
From (3a) it follows that the number of different such words containing each letter no more tiMn once is
26!/(26 - 5)!
= 26' 25 . 24 . 23' 22 = 7893600.
•
Combinations
In a permutation, the order of the selected things is essential. In contrast, a combination
of given things means any selection of one or more things without regard to order. There
are two kinds of combinations, as follows.
The number of combinations of Il different things, taken k at a time, without
repetitions is the number of sets that can be made up from the n given things, each set
containing k different things and no two sets containing exactly the same k things.
The number of combinations of Il different things, taken k at a time, with repetitions
is the number of sets that can be made up of k things chosen from the given n things,
each being used as often as desired.
CHAP. 24
1008
Data Analysis. Probability Theory
For example, there are three combinations of the three letters a, b, c, taken two letters
at a time, without repetitions. namely, lib, lIC, bc, and six such combinations with
repetitions, namely, lib, lIC, be, aa, bb, cc.
Combinations
THEOREM 3
The Ilumber of d~fferent combinations of 11 d!fferent things taken. k at a time. withollt
repetitiollS. is
(4a)
C) =
1) ... (n - k
n(n -
n!
+
I)
I ' 2· .. k
k!(n - k)!
lind the number of those combinlltions with repetitions is
(4b)
PROOF
E X AMP L E 3
The statement involving (4a) follows from the first part of Theorem 2 by noting that there
are k! permutations of k things from the given n things that differ by the order of the
elements (see Theorem I). but there is only a single combination of those k things of the
type characterized in the first statement of Theorem 3. The last statement of Theorem 3
can be proved by induction (see Team Project 18).
•
Illustration of Theorem 3
The number of samples of five lightbulbs that can be selected from a 1m of 500 bulbs is rsee (4a 11
500 . 499 . 498 . -1-97 . 496
500)
500!
= 255 244 687 600.
( 5
= 5!495! =
1·2·3'4·5
•
Factorial Function
In (I )-(4) the factorial function is basic. By definition.
(5)
O!
=
L
Values may be computed recursively from given values by
(6)
(n
+
l)!
=
(II
+
l)n!.
For large 11 the function is very large (see Table A3 in App. 5). A convenient approximation
for large 11 is the Stirling formula 2
(7)
2 jAMES
(e
STIRUI\G 0692-1770). Scots mathematician.
= 2.718 ... )
SEC. 24.4
1009
Permutations and Combinations
where - is read "asymptotically equal" and means that the ratio of the two sides of (7)
approaches 1 as 11 approaches infinity.
EXAMPLE 4
Stirling Formula
[
n!
By (7)
Exact Value
Relative Error
4!
I
IO!
23.5
3598696
2.422 79 . 1018
24
3628800
2 432 902 008 176 640 000
2.1%
0.8%
0.4%
l
20!
•
Binomial Coefficients
The binomial coefficients are defined by the formula
0) = o(a (k
(8)
1)(0 - 2) ... (0 - k
+
1)
~
(k
k!
0, integer).
The numerator has k factors. FUlthermore. we define
(~)
(9)
For integer
II
=
11
(~)
in particular,
= I,
= 1.
we obtain from (8)
(n ~ 0, 0 ~ k ~ n).
(10)
Binomial coefficients may be computed recursively, because
0
(11)
(
k
+
+
1)
1
~
(k
0, integer).
Formula (8) also yields
(k
(12)
~
0, integer)
(m > 0).
There are numerous further relations; we mention two important ones,
(13)
~
s=o
(k+S) = (l1+k)
k
k
+
(k ~
O. n ~ 1,
both integer)
1
and
(14)
(r
~
0, integer).
1010
CHAP. 24
Data Analysis. Probability Theory
1. List all pennutations of four digits I, 2, 3, 4, taken all
at a time.
2. List (a) all permutations, (b) all combinations without
repetitions, (c) all combinations with repetitions, of 5
letters G, e, i, 0, 1I taken 2 at a time.
3. In how many ways can we assign 8 workers to 8 jobs
(one worker to each job and conversdy)?
4. How many samples of 4 objects can be drawn from a
lot of 80 objects?
5. In how many different ways can we choose a
committee of 3 from 20 persons? First guess.
6. In how many different wa) s can we select a committee
consisting of 3 engineers. 2 biologists. and 2 chemists
from 10 engineers, 5 biologist~, and 6 chernists? First
guess.
7. Of a lot of 10 items, 2 are defective. (a) Find the number
of different samples of 4. Find the number of samples
of 4 containing (b) no defectives, (c) I defective, (d) 2
defectives.
8. If a cage contains 100 mice, two of which are male,
what is the probability that the two male mice will be
included if 12 mice are randomly selected?
9. An urn contains 2 blue, 3 green, and 4 red balls. We
draw I ball at random and put it aside. Then we draw
the next ball, and so on. Find the probability of drawing
at first the 2 blue balls, then the 3 green ones, and
finally the red ones.
10. By what factor is the probability in Prob. 9 decreased
if the number of balls is doubled (4 blue, etc.)?
11. Detennine the number of different bridge hands. (A
bridge hand consists of 13 cards selected from a full
deck of 52 cards.)
12. In how many different ways can 5 people be seated at
a round table?
13. If 3 suspects who committed a burglary and 6 innocent
persons are lined up, what is the probability that a
witne~s who is not sure and has to pick three persons
will pick the three suspects by chance? That the witness
picks 3 innocent persons by chance?
24.5
14. (Birthday problem) What is the probability that in a
group of 20 people (that includes no twins) at least two
have the same birthday. if we assume that the
probability of having birthday on a given day is 11365
for every day. First guess.
15. How many different license plates showing 5 symbols,
nanlely, 2 letters followed by 3 digits. could be made?
16. How many automobile registrations may the police
have to check in a hit-and-run accident if a witness
reports KDP5 and cannot remember the last two digits
on the license plate but is certain that all three digits
were different?
17. CAS PROJECT. Stirling formula. (a) Using (7),
compute approximate values of n! for 11 = I, ... ,20.
(b) Detennine the relative error in (a). Find an
empirical formula for that relative error.
(el An upper bound for that relative error is e 1/ 12n
Try to relate your empirical formula to this.
-
L.
(d) Search through the literature for further
infonnation on Stirling's fonnula. Write a short report
about your findings, arranged in logical order and
illustrated with numeric examples.
18. TEAM PROJECT. Permutations, Combinations.
(a) Prove Theorem 2.
(b) Prove the last statement of Theorem 3.
(e) Derive (11) from (8).
(d) By the binomial theorem,
so that llkbn - k has the coefficient (~). Can you
conclude this from Theorem 3 or is this a mere
coincidence?
(e) Prove (14) by using the binomial theorem.
(f) Collect further formulas for binomial coefficients
from the literature and illustrate them numerically.
Random Variables.
Probability Distributions
In Sec. 24.1 we considered frequency distributions of data. These distributions show the
absolute or relative frequency of the data values. Similarly, a probability distribution
or, briefly, a distribution, shows the probabilities of events in an experiment. The quantity
that we observe in an experiment will be denoted by X and called a random variable (or
SEC 24.5
1011
Random Variables. Probability Distributions
stochastic variable) because the value it will assume in the next trial depends on chance,
on randomness-if you roll a dice, you get one of the numbers from I to 6, but you don't
know which one will show up next. Thus X = Number a die tUnIS up is a random variable.
So is X = Elasticity of rubber (elongation at break). ("Stochastic" means related to chance.)
If we COUllt (cars on a road, defective screws in a production, tosses until a die shows
the first Six), we have a discrete random variable and distribution. If we measure
(electric voltage, rainfall, hardness of steel), we have a continuous random variable and
distribution. Precise definitions follow. In both cases the distribution of X is determined
by the distribution function
F(x)
(1)
= P(X ~ x);
this is the probability that in a trial, X will assume any value not exceeding x.
CAUTION! The terminology is not uniform. F(x) is sometimes also called the
cumulative distribution function.
For (I) to make sense in both the discrete and the continuous case we formulate
conditions as follows.
DEFINITION
r
Random Variable
A random variable X is a function defined on the sample space S of an experiment.
Its values are real numbers. For every number 11 the probability
P(X = a)
with which X assumes a is defined. Similarly, for any interval 1 the probability
P(X E I)
with which X assumes any value in 1 is defined.
Although this definition is very general, practically only a very small number of
distributions wil1 occur over and over again in applications.
From (I) we obtain the fundamental formula for the probability corresponding to an
interval a < x ~ b,
Pea < X
(2)
~
b) = F(b) - F(a).
This follows because X ~ a ("X assumes any value not exceeding a") and a < X ~ b
("X assumes any value in the illterval a < x ~ b") are mutually exclusive events, so that
by (1) and Axiom 3 of Definition 2 in Sec. 24.3
F(b)
= P(X ~ b) = P(X ~
=
F(a)
+
and subtraction of F(a) on both sides gives (2).
a)
+
Pea < X
Pea < X
~
b)
~
b)
1012
CHAP. 24
Data Analysis. Probability Theory
Discrete Random Variables and Distributions
By definition, a random variable X and its distribution are discrete if X assumes only
finitely many or at most countably many values Xl' -'"2' X3, .... called the possible values
of X, with positive probabilities PI = P(X = Xl), P2 = P(X = X2), P3 = P(X = X3), ... ,
whereas the probability P(X E 1) is zero for any interval 1 containing no possible value.
Clearly, the discrete distribution of X is also determined by the probability function
f(x) of X, defined by
p
(3)
f(x)
=
{
~
(j
=
1,2. ... ),
otherwise
From this we get the values of the distribution function F(x) by taking sums,
(4)
where for any given x we sum all the probabilities Pj for which Xj is smaller than or equal
to that of x. This is a step function with upward jumps of size Pj at the possible values
Xj of X and constant in between.
E X AMP L E 1
Probability Function and Distribution Function
Figure 512 shows the probability function f(x) and the distribution function F(x) of the discrete random variable
x=
Number a fair die tums up.
X has the possible values x = 1, 2, 3, 4, 5, 6 with probability 116 each. At these x the distribution function has
upward jumps of magnitude 1/6. Hence from the graph of flx) we can construct the graph of F(x), and conversely.
In Figure 512 (and the next one) at each jump the fat dot indicates theftmctioll value at the jump!
•
t(xl
Y6[ I I I I I I
o
5
x
o
5
10
12
x
12
x
F(x)
F(x)
1
30
36
20
36
1
2"
10
36
o
,
5
x
Fi-. 512. Probability function {(x)
and distribution function F(x) of the
random variable X = Number
obtained in tossing a fair die once
!
[
10
Fig. 513. Probability function {(x) and
distribution function F(x) of the random
variable X = Sum of the two numbers
obtained in tossing two fair dice once
SEC. 24.5
1013
Random Variables. Probability Distributions
E X AMP L E 2
Probability Function and Distribution Function
The random variable X = Slim of the two Illlll/bers t ....o fair dice tum "I' is discrete and has the possible values
2 (= 1 + II. 3.4..... 12 (= 6 + 6). There are 6' 6 = 36 equally likely outcomes (I. 1) (I. 2)....• (6. 6).
where the first number is lhat shown on the first die and the second number that on the other die. Each such
outcome has probability 1/36. Now X = 2 occurs in the case of the outcome O. I): X = 3 in the case of the
lwo outcomes (I. 2) and (2. I \: X = 4 in the case of the three outcomes (1. 3), (2. 2). (3. I): and so on. Hence
fix) = PIX = x) and F(x) = PIX ~ x) have the values
I
i
x
2
3
4
5
f(x)
1136
1136
2/36
3/36
3/36
6/36
I F(x)
6
7
8
9
10
II
12
4/36
5/36
15/36
6/36
21/36
5/36
26/36
4/36
10136
30136
3/36
33136
2136
35/36
1136
36/36
Figure 513 shows a bar chan of this function and the graph of the distribution funcllon. which is again a srep
function. with jumps (of different height!) at the possible values of X.
Two useful formulas for discrete distributions are readily obtained as follows. For the
probability cOlTesponding to intervals we have from (2) and (4)
(5)
P(a
< X~
=
b)
=
F(b) - F(a)
2:
Pj
(X discrete).
a<xJ~b
This is the sum of all probabilities Pj for which Xj satisfies a < Xj ~ b. (Be careful about
< and ~!) From this and peS) = I (Sec. 24.3) we obtain the following formula.
2: Pj =
(6)
(sum of all probabilities).
I
j
E X AMP L E 3
illustration of Formula (5)
In Example 2. compme [he probability of a sum of at least 4 and at
Solution.
E X AMP L E 4
P(3
< X ~ 8) = F(8) - F(3) = ~ -
mo~t
8.
•
:k = ~.
Waiting Time Problem. Countably Infinite Sample Space
In to~sing a fair coin. let X = Number of trials limit the first head appears. Then. by independence of events
(Sec. 24.3).
PiX
=
1)
= P(H)
_
1
=
~.~
-2
PIX = 2) = P(TH)
PIX = 3) = P(TTH) =
and in general PiX
series.
= II) =
(!)n.
II
! .~ .! = ~.
(H =
Head)
(T=
Tail)
etc.
= 1. 2•.... Also. (6) can be confirmed by the slim form lila for the geometric
1
2 +
1
4
1
+ 8 + ... = -I
+ 1-
~
=-1+2=1.
•
1014
CHAP. 24
Data Analysis. Probability Theory
Continuous Random Variables and Distributions
Discrete random variables appear in experiments in which we count (defectives in a
production, days of sunshine in Chicago. customers standing in a line, etc.). Continuous
random variables appear in experiments in which we measure (lengths of screws, voltage
in a power line, Brinell hardness of steel, etc.). By definition. a random variable X and
its distribution are of continuous type or, briefly. continuous, if its distribution function
F(x) [defined in (1)] can be given by an integral
(7)
=
F(x)
r
f(v) dv
-00
(we write v because x is needed as the upper limit of the integral) whose integrand f(x),
called the density of the distribution, is nonnegative, and is continuous, perhaps except
for finitely many x-values. Differentiation gives the relation of f to F as
(8)
f(x) = F'(x)
for every x at which f(x) is continuous.
From (2) and (7) we obtain the very important formula for the probability corresponding
to an interval:
(9)
Pta
<X~
b) = F(b) - F(a) =
I
b
f(v) dv.
a
This is the analog of (5).
From (7) and peS) = I (Sec. 24.3) we also have the analog of (6):
I
(10)
oo
f(v) dv = 1.
_00
Continuous random variables are simpler than discrete ones with respect to intervals.
Indeed, in the continuous case the four probabilities corresponding to a < X ~ b,
a < X < b, a ~ X < b, and a ~ X ~ b with any fixed a and b (> a) are all the same.
Can you see why? (Answer. This probability is the area under the density curve, as in
Fig. 514, and does not change by adding or subtracting a single point in the interval of
integration.) This is different from the discrete case! (Explain.)
The next example illustrates notations and typical applications of our present
formulas.
Curve of density
f(X)~
/
1~<bJ
a
Fig. 514.
b
x
Example illustrating formula (9)
SEC. 24.5
1015
Random Variables. Probability Distributions
E X AMP L E 5
Continuous Distribution
Let X have the density function I(x) = O.75( I - x 2 ) if -I ~ x ~ I and 7ero otherwi~e. Find the distribution
function. Find the probabilities P( -~ ~ X ;;; ~) and PC! ~ X ~ 2). Find x such that P(X ~ x) = 0.95.
Solutioll. From
(7)
F(x)
we obtain F(x) = 0 if x ~ -I,
= 0.75
IX
(I - v 2 ) dv
= 0.5 + 0.75x -
0.25x3
if-I
<x~
I,
-1
and
F( t)
= I if x >
l. From this and (9) we get
P{-~ ~ X ~~) =
F@ -
F{-~) =
0.75
I
1/2
(I - v
2
)
dv = 68.75%
-1/2
(because P( -~
~ X ~~) = P{ -~
< X ;;; ~) for a continuous distribution) and
P{~ ~ X ~ 2) =
F(2) - F(!) = 0.75
f
1
(1 -
v
2
)
dv = 31.64%.
1/4
(Note that the upper limit of integration is I, not 2. Why?) Finally,
P{X ~ x) = F(x) = 0.5
+
0.75x - 0.25,3 = 0.95.
3
Algebraic simplification gives 3x - x = 1.8. A solution is x = 0.73, approximately.
Sketch fet) and mark x = -~,~, !. and 0.73. so that you can see the results (the probabilities) as areas under
the curve. Sketch also Fex).
•
Further examples of continuous distributions are included in the next problem set and in
later sections.
-_·.·_.w·
........-...____
- ..... ,--.-----....,;. .....
y
_
....... . - . - - , _
h 2
(x = 1, 2, 3,4, 5; k suitable) and the distribution
function.
Graph the density function fIx) = b: 2 (Q ~ X ~ 5:
k suitable) and the distribution function.
(Uniform distribution) Graph f and F when the
density is f(x) = k = const if -4 ~ x ~ 4 and 0
elsewhere.
In Prob. 3 find P(O ~ x ~ 4) and c such that
P( -c < X < c) = 95%.
Graph f and F when f{ -2) = f(2) = 1/8,
f( -1) = f(1) = 3/8. Can f have further positive
values?
Graph the distribution function F(x) = I - e- 3x if
x > 0, F(x) = 0 if x ~ 0, and the density f(x). Find x
such that F(x) = 0.9.
Let X be the number of years before a particular type
of machine will need replacement. Assume that X has
the probability function f(l) = 0.1, f(2) = 0.2,
f(3) = 0.2, 1(4) = 0.2. 1(5) = 0.3. Graph I and F.
Find the probability that the machine needs no
1. Graph the probability function f(x)
2.
3.
4.
5.
6.
7.
=
replacement during the first 3 years.
8. If X has the probability function f(x) = k/2x
(x = O. l. 2 ... '). what are k and P(X ~ 4)?
9. Find the probability that none of the three bulbs in
a traffic signal must be replaced during the first 1200
hours of operation if the probability that a bulb must
be replaced is a random variable X with density
f(x) = 6[0.25 - (x - 1.5)2] when 1 ~ x ~ 2 and
fIx) = 0 otherwise. where x is time mea~ured in
mUltiples of 1000 hours.
10. Suppose that certain bolts have length L = 200 + X mm,
where X is a random variable with density
f(x) = ~(l - x 2 ) if - 1 ~ x ~ I and 0 otherwise.
Determine c so that with a probability of 95% a bolt
will have any length between 200 - c and 200 + c.
Hint: See also Example 5.
11. Let X [millimeters] be the thickne~~ of washers a
machine turns out. Assume that X has the density
f(x) = h if 1.9 < x < 2.1 and 0 otherwise. Find k.
What is the probability that a washer will have
thickness between 1.95 mm and 2.05 mm?
1016
CHAP. 24
Data Analysis. Probability Theory
that X is between ::!.5 (40% profit) and 5 (20% profit)?
12. Suppose that in an automatic process of filling oil
into cans, the content of a can (in gallons) is
Y = 50 + X. where X is a random variable with
density fIx) = I - Ixl when Ixl :=; 1 and 0 when
Ixl > I. Graph fIx) and F(x). In a lot of 100 cans, about
how many will contain 50 gallons or more'? What is
the probability that a can will contain less than 49.5
gallons? Less than 49 gallons?
13. Let the random variable X with density fIx) = ke-;c if
o ~ x :=; 2 and 0 otherwise (x = time measured in
years) be the time after which cel1ain ball bearings are
worn out. Find k and the probability that a bearing will
last at least I year.
14. Let X be the ratio of sales to profits of some fiml.
Assume that X has the distribution function F(x) = 0
if x < 2, F(x) = (x 2 - 4)/5 if 2 :=; x < 3. F(x) = I if
x ~ 3. Find and graph the density. What is the probability
15. Show that b < c implies P(X:=; b) :=; PIX :=; c).
16. If the diameter X of axles has the density fIx} = k if
119.9 :=; x ~ 120.1 and 0 otherwise, how many
defectives will a lot of 500 axles approximately contain
if defectives are axles slimmer than 119.92 or thicker
than 120.08?
17. Let X be a random variable that can as~ume evelY real
value. What are the complements of the events X ~ b.
X<~X~~X>~b:=;X~~b<X~~
18. A box contains 4 right-handed and 6 left-handed
screws. Two screws are drawn at random without
replacement. Let X be the number of left-handed screws
drawn. Find the probabilities PIX = m. PIX = I).
PIX = 2). P( I < X < 2), P(X :=; I). PIX ~ I).
PIX > I), and P(0.5 < X < 10).
24.6 Mean and Variance of a Distribution
The mean J-L and variance 0'2 of a random variable X and of its distribution are the theoretical
counterpalts of the mean x and variance S2 of a frequency distribution in Sec. 24.1 and
serve a similar purpose. Indeed, the mean characterizes the central location and the variance
the spread (the variability) of the distribution. The mean J-L (mu) is defined by
(a)
/1-
=
2: xjf(xj)
(Discrete distribution)
j
(1)
(b)
J-L
= {"
xf(x) dx
(Continuous distribution)
-x
and the variance 0'2 (sigma square) by
(a)
0'2 =
2: L~j -
J-L)2f(xj)
(Discrete distribution)
j
(2)
(b)
0'2 =
f"" (x -
J-L)2f{x) dx
(Continuous distribution).
-=
a (the positive square root of (T2) is called the standard deviation of X and its distribution.
f
is the probability function or the density, respectively, in (a) and (b).
The mean J-L is also denoted by E(X) and is called the expectation of X because it gives
the average value of X to be expected in many trials. Quantities such as J-L and 0'2 that
measure certain properties of a distribution are called parameters. J-L and 0'2 are the two
most important ones. From (2) we see that
(3)
(except for a discrete "distribution" with only one possible value, so that 0'2 = 0). We
assume that J-L and 0'2 exist (are finite), a<; is the ca<;e for practically all distributions that
are useful in applications.
SEC. 24.6
1017
Mean and Variance of a Distribution
E X AMP L E 1
Mean and Variance
The random variable X = Number o.{heads ina single toss o{ a lair coin has the possible values X = 0 and X = I
+ 1 = ~. and
with probabiliues PIX = 0) = ~ and PIX = 1) = From (1 a) we thus obtain the mean f.L =
(2a) yields the variance
o·l
i.
·l
•
E X AMP L E 1
Uniform Distribution. Variance Measures Spread
The distribution with the
den~ity
fIx) =
b -
a<x<b
if
1I
and.f = 0 otherwise is called the uniform distribution on [he interval a < x < h. From (I b) (or from Theorem
I. below) we find that f.L = (0 + b)l2. and (2b) yields the variance
I (. _(/
b
2 _
(T
-
a
.~
+ b)2
2
2
_1_
b-l/
._
d.\-
(b - l/)
12
•
Figure 515 illustrates that the spread is large if and only if (T2 is large.
((x)
{(x)
If-----,
o
1
(0
2
= 1/12)
x
1
o
-1
F(x)
2
x
2
x
F(x)
1
-1
Fig. 515.
Uniform distributions having the same mean (0.5) but different variances u'
Symmetry. We can obtain the mean JL without calculation if a distribution is symmetric.
Indeed, you may prove
THEOREM 1
Mean of a Symmetric Distribution
If a distribution is symmetric ~rith re!oJpect to x
thell JL = c. (Examples I and 2 illustrate this.)
= c, that is,
f(c - x) = f(c
+ x),
Transformation of Mean and Variance
Given a random variable X with mean JL and variance (]"2. we want to calculate the mean
and variance of X* = al + a 2X. where al and a2 are given constants. This problem is
important in statistics, where it appears often.
CHAP. 24
1018
THE 0 REM 2
Data Analysis. Probability Theory
Transformation of Mean and Variance
(a)
If a
random variable X has mean J.L and variance
(]"2,
then the random
variable
(4)
has the mean J.L* and variance
(]"*2.
where
(5)
and
(b) In particular, the standardized random variable Z corresponding to X,
given by
X-J.L
Z=--
(6)
(]"
has the mean 0 and the variance 1.
PROOF
We prove (5) for a continuous distribution. To a small interval I of length .it" on the
x-axis there corresponds the probability f(X)flx [approximately; the area of a rectangle of
base .i,. and height f(x)]. Then the probability f(x).::lx must equal that for the corresponding
interval on the x*-axis, that is, f*(x*)Llx*, where f* is the density of X* and Llx* is the
length of the interval on the x*-axis corresponding to l. Hence for differentials we have
f*(x*) dx* = f(x) dx. Also. x* = al + a2x by (4). so that (lb) applied to X* gives
J.L* =
I"" x*f*(x*) dx*
-co
=
I
x
(al
+
a2x )f(x) dx
-x
=
al
{Xl f(x) dx + a2 I= xf(x) dx.
-x
-CD
On the right the first integral equals 1, by (10) in Sec. 24.5. The second integral is J.L. This
proves (5) for J.L *. It implies
From this and (2) applied to X*, again using f*(x*) dx* = f(x) dx. we obtain the second
formula in (5),
(]"*2
=
IX (x* -~
J.L*)21*(x*) dx*
=
a 22
!"C (x -
J.L)2f(x) dx
=
a2 2(]"2.
-x
For a discrete distribution the proof of (5) is similar.
Choosing a l = - J.LI(]" and a2 = I/(]" we obtain (6) from (4), writing X*
{II, a2 formula (5) gives J.L* = 0 and (]"*2 = 1, as claimed in (b).
= Z. For these
•
SEC. 24.6
1019
Mean and Variance of a Distribution
Moments
Expectation,
Recall that (1) defines the expectation (the mean) of X, the value of X to be expected on
the average, written IL = E(X). More generally, if g(x) is non constant and continuous for
all x, then g(X) is a random variable. Hence its mathematical expectation or, briefly, its
expectation E(g(X» is the value of g(X) to be expected on the average. defined [similarly
to (I)] by
(7)
E(g(X»
=
2: g(.l)f(xj)
E(g(X»
or
=
fC
g(x)f(x) dx.
-x
j
In the first fonnula, f is the probability function of the discrete random variable X. In the
second formula, f is the density of the continuous random variable X. Important special
cases are the kth moment of X (where k = L 2.... )
(8)
E(Xk) =
2: x/f(xj)
or
j
and the kth central moment of X (k
(9)
E([X - IL]k)
=
2: (Xj
=
L, 2, ... )
or
- ILlf(X.i)
{Xl (x _ ILlf(x) dx.
-x
j
This includes the first moment. the mean of X
(10)
IL
=
E(X)
[(8) with k
= I].
It also includes the second central moment, the variance of X
[(9) with k = 2].
(11)
For later use you may prove
(12)
11-:.6J
E(l)
MEAN, VARIANCE
Find the mean and the variance of the random variable X
with probability function or density f(x).
1. f(x)
= 2x
(0;:;; x ;:;; I)
2. f(O) = 0.512,
f(3) = 0.008
f(l) = 0.384,
f(2) = 0.096,
3. X = Number a fair die turns up
4. Y = -4X + 5 with X as in Prob. 1
5. Uniform distribution on [0, 8]
6. f(x) = 2e- 2x (x ~ 0)
= 1.
7. What is the expected daily profit if a store sells X air
conditioners per day with probability f(lO) = 0.1,
fell) = 0.3, f(12) = 0.4, f(13) = 0.2 and the profit
per conditioner is $55?
8. What is the mean life of a light bulb whose life X [hours]
has the density f(x) = O.OOle- O.OOlx (x ~ O)?
9. If the mileage (in multiples of 1000 mi) after which a tire
must be replaced is given by the random variable X with
density f(x) = ()e- flx (x > 0). what mileage can you
expect to get on one of these tires? Let = 0.04 and find
the probability that a tire will last at least 40000 mi.
e
1010
CHAP. 24
Data Analysis. Probability Theory
10. What sum can you expect in rolling a fair die 10 times?
Do it. Repeat this experiment 20 times and record how
the sum varies.
11. A small filling station is supplied with gasoline every
Saturday afternoon. Assume that its volume X of sales
in ten thousands of gallons has the probability density
f(x) = 6x( I - x) if 0 ~ x ~ 1 and 0 otherwise.
Determine the mean. the vaIiance. and the standardized
variable.
12. What capacity must the tank in Prob. II have in order
that the probability that the tank will be emptied in a
given week be 5%'1
13. Let X [cm] be the diameter of bolts in a production.
Assume that X has the density
f(x) = k(x - 0.9)( 1.1 - x) if 0.9 < x < 1.1 and 0
otherwise. Detern1ine k. sketch fIx). and find JL and u 2 .
14. Suppose that in Prob. 13. a bolt is regarded as being
defective if its diameter deviates from 1.00 cm by more
than 0.09 cm. What percentage of defective bolts
should we then expect?
15. For what choice of the maximum possible deviation c
from 1.00 cm shall we obtain 3o/c defectives in Probs.
13 and 14?
24.7
16. TEAM PROJECT. Means, Variances, Expectations.
(a) Show that E(X - IL) = 0, (]"2 = E(X 2 ) - IL2.
(b) Prove (10)-(12).
(c) Find all the moments of the uniform distribution
on an interval a ~ x ~ b.
(d) The skewness 'Y of a random variable X is defined
by
(13)
'Y = 3
1
3
E([X - ILl ).
(]"
Show that for a symmetric distribution (whose third
central moment exists) the skewness is 7ero.
(e) Find the skewness of the distribution with density
fIx) = xe- x when x > 0 and fIx) = 0 otherwise.
Sketch f(x).
(I) Calculate the skewness of a few simple discrete
distributions of your own choice.
(g) Find a non symmetric discrete distribution with 3
possible values. mean O. and skewness O.
Binomial, Poisson, and Hypergeometric
D istri butions
These are the three most important discrete distributions. with numerous applications.
Binomial Distribution
The binomial distribution occurs in games of chance (rolling a die, see below, etc.),
quality inspection (e.g., counting of the number of defectives), opinion polls (counting
number of employees favoring certain schedule changes, etc.), medicine (e.g., recording
the number of patients recovered by a new medication), and so on. The conditions of its
OCCUlTence are as follows.
We are interested in the number of times an event A occurs in n independent trials. In
each trial the event A has the same probability P(A) = p. Then in a trial, A will not occur
with probability q = I - p. rn 11 trials the random variable that interests us is
x
=
Number oj times the event A occurs in
11
trials.
X can assume the values 0, I. .... 11. and we want to detennine the corresponding
probabilities. Now X = x means that A occurs in l" trials and in n - x trials it does not
occur. This may look as follows.
SEC. 24.7
1021
Binomial, Poisson. and Hypergeometric Distributions
A
A··· A
B
'---v----'
(1)
B··· B.
~
x times
11 -
x times
Here B = A C is the complement of A, meaning that A does not Occur (Sec. 24.2). We now
use the assumption that the trials are independent, that is. they do not influence each other.
Hence (I) has the probability (see Sec. 24.3 on independent events)
pp ... p
. qq'"
'-------v-----'
(1 *)
x times
q
=
p"·qn-x.
'-------v-----'
x times
11 -
Now (l) is just one order of ananging x A's and 11 - x B's. We now use Theorem l(b)
in Sec. 24.4, which gives the number of permutations of 11 things (the 11 outcomes of the
II trials) consisting of 2 classes, class I containing the III = x A's and class 2 containing
the 11 - III = 11 - x B's. This number is
n!
x!(n - x)! = (:) .
Accordingly, (1 *) multiplied by this binomial coefficient gives the probability P(X = x) of
X = x, that is, of obtaining A precisely x times in 11 trials. Hence X has the probability function
(\' = 0, I, ... , /l)
(2)
and I(x) = 0 otherwise. The distribution of X with probability function (2) is called the
binomial distribution or Be1'1loll11i distriblltion. The occunence of A is called sllccess
(regardless of what it actually is: it may mean that you miss your plane or lose your watch)
and the nonoccunence of A is calledfailllre. Figure 516 shows typical examples. Numeric
values can be obtained from Table A5 in App. 5 or from your CAS.
The mean of the binomial distribution is (see Team Project 16)
(3)
J.L
= np
and the variance is (see Team Project 16)
(4)
(]'2
=
npq.
0.5
Fig. 516.
°O!:---'--'-~----:C5
O!--,--'-...L~5
p=O.l
p=O.2
jll
O:--,--'-...L.1....:!5
0~~-'-.......,.5
p=O.5
p=O.8
p =0.9
0
5
Probability function (2) of the binomial distribution for n = 5 and various values of p
1022
CHAP. 24
Data Analysis. Probability Theory
For the symmetric case of equal chance of success and failure (p
the mean nl2, the variance nJ4, and the probability function
(2*)
E X AMP L E 1
f(x)
= (:)
=
q
(~r
= 112) this gives
(x = 0, 1, ... , n).
Binomial Distribution
Compute the probability of obtaining at least two "Six" in rolling a fair die 4 times.
Solution.
p = peA) = P("Six") = 1/6, q
2 or 3 or 4 "Six." Hence the answer is
P = J(2)
+
J(3)
+
J(4) =
(~)
=
5/6. n
= 4. The event "At leasr two
r(r
'Six'" occurs if we obtain
r(
(i % + (~) (i %) + (:) (i
I
= - 4 (6·25 + 4·5 +
6
I)
r
•
171
= - - = 13.2%.
1296
Poisson Distribution
The discrete distribution with infinitely many possible values and probability function
(5)
(x = 0, 1, . , .)
f(x) =
is called the Poisson distribution, named after S. D. Poisson (Sec, 18.5). Figure 517
shows (5) for some values of f.L. It can be proved that this distribution is obtained as a
limiting case of the binomial distribution, if we let p ~ and n ~ 00 so that the mean
f.L = np approaches a finite value. (For instance, f.L = np may be kept constant.) The
Poisson distribution has the mean f.L and the variance (see Team Project 16)
°
(6)
Figure 517 gives the impression that with increasing mean the spread of the distribution
increases, thereby illustrating formula (6), and that the distribution becomes more and
more (approximately) symmetric.
0.5
o
11 = 0.5
Fig. 517.
11=1
11=2
5
10
11=5
Probability function (S) of the Poisson distribution for various values of J1,
SEC. 24.7
1023
Binomial, Poisson. and Hypergeometric Distributions
E X AMP L E 2
Poisson Distribution
If the probability of producing a defective screw is I' = 0.01. what is the probability that a lot of 100
will contain more than 2 defectives'!
screw~
Solutioll.
The complementary event is AC : Not more thaI! 2 defeethoes. For its probability we get from the
binomial distribution with mean M = 111' = I the value [see (2)]
P(A c ) =
(
2
98
100)
0
(J.YY lOO + (100)
I
(J.OI • 0.yy99 + ( 100)
2 0.01' 0.Y9 .
Since p is very small. we can approximale this by the much more conveniem Poisson distriblllion with mean
J1. = 111' = 100· 0.01 = I. obtaining [see (5)]
=
91.97%.
Thu, PIA) = 8.03%. Show that the binomial distribution gives PIA) = 7.94%, so that the
is quite good.
E X AMP L E 3
Poi~son
approximation
•
Parking Problems. Poisson Distribution
If on the average. 2 cars enter a certain parking lot per minute. what is the probability that during any given
minute 4 or more car, will enter the lot?
Solutioll.
To understand that the Poisson distribution is a model of the situation. we imagine the minute to
be divided into very many short time intervals, let p be the (constant) probability that a car will enter the lot
during any such short interval. and assume independence of the events that happen during those imervals. Then
we are dealing with a binomial distribution with very large 11 and very small 1', which we can approximate by
the Pois~on distriblllion with
M = 1117 = 2,
because 2 cars enter on the average. The complementary event of the event ·'4 cars or more during a given
minute" is "3 cars orfewer elller the lot" and ha~ the prohability
0
f(O)
+
f(l)
+
f(2)
+
f(3) = e-
=
Answer: l·t3"k.
2
2
( -0
1
21
+ -
I!
22
+ -
2!
3
2 )
+ -
3!
0.857.
•
(Why did we consider that complement"!)
Sampling with Replacement
This mean, that we draw things from a given set one by one, and after each trial we
replace the thing drawn (put it back to the given set and mix) before we draw the next
thing. This guarantees independence of trials and leads to the binomial distribution.
Indeed, if a box contains N things, for example. screws. M of which are defective, the
probability of drawing a defective screw in a trial is p = MIN. Hence the probability of
drawing a nondefective screw is q = 1 - p = 1 - MIN, and (2) gives the probability of
drawing x defectives in n trials in the form
(7)
_ (11) (M)X
(I - M)n-X
-
f(x) -
x
N
N
(x
= 0,
I, ... ,
11).
1024
CHAP. 24
Data Analysis. Probability Theory
Sampling without Replacement.
Hypergeometric Distribution
Sampling without replacement means that we return no screw to the box. Then we no
longer have independence of trials (why?). and instead of (7) the probability of drawing
x defectives in n trials is
(x
I(x) =
(8)
=
0, I, ... ,
11).
The distribution with this probability function is called the hypergeometric distribution
(because its moment generating function (see Team Project 16) can be expres~ed by the
hypergeometric function defined in Sec. 5.4, a fact that we shall not use).
Derivation of (8). By (4a) in Sec. 24.4 there are
(a) (:) different ways of picking
11
things from N.
(b)
(~)
(c)
(Nn -- M\
different ways of picking n xJ
different ways of picking x defectives from M,
x nondefectives from
N- M,
and each way in (b) combined with each way in (c) gives the total number of mutually
exclusive ways of obtaining x defectives in 11 drawings without replacement. Since (a) is
the total number of outcomes and we draw at random. each such way has the probability
l/(Ij . From this, (8) follows.
•
The hypergeometric distribution has the mean (Team Project 16)
M
(9)
J.L=/l-
N
and the variance
nM(N - MeN - 11)
(10)
E X AMP L E 4
N 2 (N -
I)
Sampling with and without Replacement
We want to draw random samples of two gaskets from a box containing 10 gaskets. three of which are defective.
Find the probability function of the random variable X = Number of defectives ill the sample.
Solution.
We have N = 10. M = 3. N - M = 7.
f(x)
=
(~) I~
(
r( r7
10
Il
= 2. For sampling with replacement. (7) yields
x
J(O) = 0.49,
f(l)
= 0.42, J(2) = 0.09.
For sampling without replacement we have to use (8), finding
21
fCO) = f(l) =
45 "" 0.47.
3
f(2) = 45 = 0.07.
•
SEC. 24.7
1025
Binomial, Poisson, and Hypergeometric Distributions
If N, M, and N - M are large compared with n, then it does not matter too much whether
we sample with or without replacemellt, and in this case the hypergeometric distribution
may be approximated by the binomial distribution (with p = MIN), which is somewhat
simpler.
Hence in slimp ling from an indefinitely large population ("infinite population") we
1Il11\' use the binomial distribution, regardless of whether we sample with or withollI
replacement.
===== -... -...-.. -..
:
..
..".-~
...
--
1. Four fair coins are tossed simultaneously. Find the
probability function of the random variable X = Number
of heads and compute the probabilities of obtaining no
heads, precisely I head, at least I head, not more than
3 heads.
2. If the probability of hitting a target in a single shot is
10% and 10 shots are fired independently. what is the
probability that the target will be hit at least once?
3. In Prob. 2, if the probability of hitting would be 5%
and we fired 20 shots. would the probability of hitting
at least once be less than. equal to, or greater than in
Prob. 2? Guess first, then compute.
4. Suppose that 3% of bolts made by a machine are
defective, the defectives occurring at random during
production. If the bolts are packaged 50 per box,
what is the Poisson approximatIOn of the probabilit)
that a given box will contain x = 0, 1, ... , 5
defeClives?
5. Let X be the number of cars per minute passing a certain
point of some road between 8 A.M. and 10 A.M. on a
Sunday. Assume that X has a Poisson disuibution with
mean 5. Find the probability of observing 3 or fewer
cars during any given minute.
6. Suppose thar a telephone switchboard of some
company on the average handles 300 calls per hour,
and that the board can make at most 10 connections
per minute. Using the Poisson disU'ibution, estimate the
probability that the board will be overtaxed during a
given minute. (Use Table A6 in App. 5 or your CAS.)
7. (Rutherford-Geiger experiments) In 1910. E.
Rutherford and H. Geiger showed experimenrally that
the number of alpha particles emitted per second in a
radioactive process is a random variable X having a
Poisson distribution. If X has mean 0.5, whar is the
probability of observing two or more particles during
any given second?
8. A process of manufacturing screws is checked every
hour by inspecting 11 screws selected at random from
that hour's production. If one or more screws are
defective, the process is halted and carefully examined.
How large should n be if the manufacturer wants the
probability to be about 95% that the process will be
haIted when 10% of the screws being produced are
defective? (Assume independence of the quality of any
screw of that of the other screws.)
9. Suppose that in the production of 50-n resistors,
nondefective items are those that have a resistance
between 45 nand 55 n and the probability of a
resistor's being defective is 0.2%. The resistors are sold
in lots of 100, with the guarantee that all resistors are
nondefective. What is the probability that a given lot
will violate this guarantee? (Use the Poisson
distribution.)
10. Let P = lo/c be the probability that a certain type of
lightbulb will fail in a 24-hr test. Find the probability
that a sign consisting of 10 such bulbs will bum 24
hours with no bulb failures.
11. Guess how much less the probability in Prob. 10 would
be if the sign consisted of 100 bulbs. Then calculate.
12. Suppose that a certain type of magnetic tape contains.
on the averdge, 2 defects per 100 meters. What is the
probability that a roll of tape 300 meters long will
contain (a) x defects, (b) no defects?
13. Suppose thar a test for extrasensory perception consists
of naming (in any order) 3 cards randomly drawn from
a deck of 13 cards. Find the probability that by chance
alone, the person will correctly name (a) no cards,
(b) I card, (c) 2 cards, (d) 3 cards.
14. A carton contains 20 fuses, 5 of which are defective.
Find the probability that. if a sample of 3 fuses is
chosen from the carton by random drawing without
replacement, x fuses in the sample will be defective.
15. (Multinomial distribution) Suppose a trial can result in
precisely one of k mutually exclusive events Ab ... , Ak
with probabilities PI' .•• , Pk' respectively, where
PI + ... + Pk = 1. Suppose that /I independent trials
are performed. Show that the probability of getting
Xl AI's, . . • , Xk Ak's is
where
Xl
0
~ Xj ~ II,
+ ... + Xk
= 11.
j = I, ... , k. and
The distribution having this
CHAP. 24
1026
Data Analysis. Probability Theory
(b) Shov. that the binomial distribution has the
moment generating function
probability function is called the lIlultinomial
distribution.
16. TEAM PROJECT. Moment Generating Function.
The moment generating function G(t) is defined by
~
tX
tx·
G(t) = E(e J) = .L.J e 'f(xj)
= (pet
+ q)".
or
G(t)
=
E(etx )
=
(c) Using (b), prove (3).
fX et·t:f(x) dx
(d)
-x
where X is a discrete or continuous random variable,
respectively.
(a) Assuming that termwise differentiation and
differentiation under the integral sign are permissible,
show that E(Xk) = dkl(O), where d k ) = dkG/dtk. in
particular, /L = G' (0).
24.8
Prove (4).
(e) Show that the Poisson distribution has the moment
generating function G{t) = e-lLe lLe ' and prove (6).
(f)
Prove x
(~)
Using this. prove
= M
(~ =!) .
(9).
Normal Distribution
Turning from discrete to continuous distributions, in this section we discuss the normal
distribution. This is the most important continuous distribution because in applications
many random variables are normal random variables (that is, they have a normal
distribution) or they are approximately normal or can be transformed into normal random
variables in a relatively simple fashion. Furthermore, the normal distribution is a useful
approximation of more complicated distributions. and it also occurs in the proofs of various
statistical tests.
The normal distribution or Gauss distribution is defined as the distribution with the
density
(1)
f(x) = -I- exp [_ 21
d\!2;
(x -cr )2J
J.L
(cr> 0)
where exp is the exponential function with base e = 2.718 .... This is simpler than it
may at first look. fex) has these features (see also Fig. 518).
1. J.L is the mean and cr the standard deviation.
2. l/(dV2;) is a constant factor that makes the area under the curve of f(x) from
to x equal to L as it must be by (0), Sec. 24.5.
-x
3. The curve of f(x) is symmetric with respect to x = J.L because the exponent is
quadratic. Hence for J.L = 0 it is symmetric with respect to the y-axis x = 0
(Fig. 518, "bell-shaped curves").
4. The exponential function in (1) goes to zero very fast-the faster the smaller the
standard deviation cr is, as it should be (Fig. 518).
SEC. 24.8
1027
Normal Distribution
((x)
cr = 1.0
2
Fig. 518.
x
Density (1) of the normal distribution with /L = 0 for various values of u
Distribution Function F(x)
From (7) in Sec. 24.5 and (I) we see that the normal di<;tribution has the distribution
function
(2)
.~
IX
27T
Fex) =
(TV
exp [- 21 (u - J.L
(T
-00
)2J du.
Here we needed x as the upper limit of integration and wrote u (instead of x) in the integrand.
For the conesponding standardized normal distribution with mean 0 and standard
deviation I we denote F(x) by <P(.:). Then we simply have from (2)
I
Vh
I
(3)
Z
e- u2/2 duo
cI>(.:) = - -
-x
This integral cannot be integrated by one of the methods of calculus. But this is no serious
handicap because its values can be obtained from Table A 7 in App. 5 or from your CAS.
These values are needed in working with the normal distribution. The curve of cI>(z) is
S-shaped. It increases monotone (why?) from 0 to 1 and intersects the vertical axis at
112 (why?), as shown in Fig. 519.
Relation Between F(x) and «I>(z). Although your CAS will give you values of F(x) in
(2) with any J.L and (T directly, it is important to comprehend that and why any such an
F(x) can be expressed in terms of the tabulated standard cI>(z), as follows.
y
,",(xl
~
1.0
0.8
/
0.6
o.
/0.2
-3
Fig. 519.
-2
-1
0
2
3
x
Distribution function <I>(z) of the normal distribution with mean 0 and variance 1
1028
CHAP. 24
THE 0 REM 1
Data Analysis. Probability Theory
Use of the Normal Table A7 in App. 5
r
The distributioll fimctioll FCr) of the nonnal distriblltioll with allY J.L and 0" see (2)]
is related to the stalldardi-;.ed distribution filllCtiOIl <1>(.:) ill (3) by the formula
(4)
PROOF
F(x)
= <I> ( -x-J.L)
0"-
.
Comparing (2) and (3) we see that we should set
u=
Then v
= x gives
u=
0"
0"
as the new upper limit of integration. Also v - J.L
0" drops out.
I
I(X-P)/u
u2 2
F(x)
= --O"~
e-
/
=
0"
O"U,
du
thus dv
=
0" du.
Together, since
X )
= <I> ( _
_J.L_
0"
-ex:
•
Probabilities corresponding to intervals will be needed quite frequently in statistics in
Chap. 25. These are obtained as follows.
THEOREM 2
Normal Probabilities for Intervals
The probability that a normal random variable X with mean J.L and standard
del'iatioll 0" assume allY vallie ill all inten'al a < x ~ b is
(5)
PROOF
Pea
<
X
~
b) = F(b) - F(a) = <I> ( -b-J.L)
0 " - - <J:> (a-J.L)
-0"- .
.
Formula (2) in Sec. 24.5 gives the first equality in (5). and (4) in this section gives the
~~~~~.
Numeric Values
In practical work with the normal disnibution it is good to remember that about 2/3 of all
values of X to be observed will lie between J.L ± 0", about 95% between J.L ± 20", and practically
all between the three-sigma limits J.L ± 30". More precisely, by Table A7 in App. 5,
(6)
<X
(a)
P(J.L - 0"
(b)
P(J.L - 20"
<
(c)
P(J.L - 30"
<X
~
J.L
X ~ J.L
~ J.L
+
0") = 68%
+ 20") =
+
95.5%
30") = 99.7%.
Formulas (6a) and (6b) are illustrated in Fig. 520.
The formulas in (6) show that a value deviating from JL by more than 0", 20", or 30" will
occur in one of about 3, 20, and 300 trials, respectively.
SEC. 24.8
1029
Normal Distribution
2.25%
-\
r
.u-20
(a)
95.5%
'/
2.25%
'tl
!
.u
.u+20
(b)
Fig. 520.
Illustration of formula (6)
In tests (Chap. 25) we shall ask conversely for the intervals that corre<;pond to certain
given probabilities; practically most important are the probabilities of 95%, 99%, and
99.9%. For these, Table AS in App. 5 gives the answers J.L ± 2u, J.L ± 2.5u, and
J.L ± 3.3u, respectively. More precisely,
(7)
(a)
P(J.L - 1.96u
<X
~
J.L
+
1.96u)
= 95%
(b)
P(J.L - 2.5Su
<X
~
J.L
+
2.5Su)
= 99%
(e)
P(J.L - 3.29u
<X
~
J.L
+
3.29u) = 99.9%.
Working With the Normal Tables A7 and AS in App. 5
There are two normal tables in App. 5, Tables A7 and A8. If you want probabilities, use
Table A7. If probabilities are given and corresponding intervals or x-values are wanted,
use Table AS. The following examples are typical. Do them with care. verifying all values,
and don't just regard them as dull exercises for your software. Make sketches of the density
to see whether the results look reasonable.
E X AMP L E 1
Reading Entries from Table A7
If X is standardi7ed nonnal (so that /L = O. a = 1), then
P(X ~ 2.'14) = 0.9927 = 99~%
P(X~
P(X
~
PO.O
E X AMP L E 2
-1.16)
= 1 - <1>(1.16) = 1 - 0.8770 = 0.1230 = 12.3'J1:
1) = 1 - P(X
~
X
~
~
I)
=
1 - 0.H413 = 0.1587 by (7), Sec. 24.3
1.8) = <1>(1.8) - <1>(1.0) = 0.9641 - 0.8413 = 0.1228.
•
Probabilities for Given Intervals, Table A7
Let X be normal with mean 0.8 and variance 4 (so that a = 2). Then by (4) and (5)
P(X ~ 2.44)
=
F(2.44)
= q) (
2.44 - 0.80 )
2
= <1>(0.82) = 0.7939
= 80%
or if you like it better (similarly in the other cases)
P(X ~ 2.44)
PIX ~ 1)
=
=
P(
2.44 - 0.80 )
X - 0.80
2
~
2
=
1-08)
I - P(X ~ 1) = I - <I> ( --2-'-
P(1.0 ~ X ~ 1.8) = <1>(0.5) - <1>(0.1)
=
P(Z ~ 0.82)
=
= 0.7939
I - 0.5398 = 0.4602
0.6915 - 0.5398 = 0.1517.
•
1030
E X AMP L E 3
CHAP. 24
Data Analysis. Probability Theory
Unknown Values c for Given Probabilities, Table AS
Let X be nonnal with mean 5 and variance 0.04 (hence standard deviation 0.1). Find c or k corresponding to
the given probability
PIX
~ c) =
P(5 - k
P(X
E X AMP L E 4
~
950/(.
X
~ c) =
~
19C.
5)
c -
<P ( 0:2
=
5 + k) = 90%,
thus PIX
c - 5
0.1 = 1.645.
95%.
5 + k = 5.319
~ c) =
(as before: why?)
c-5
0:2
99%.
c = 5.319
=
c = 5.46S.
2.326.
•
Defectives
In a production of iron rods let the diameter X be nunnally distributed with mean 2 in. ilnd standard deviation
0.008 in.
(a) What percentage of defectives can we expect if we ,et the tolerance limit, at 2 ::':: 0.01 in.?
(b) How should we set the tolerance limits to allow for 4'lt defectives?
Solutioll.
(a) I!'k because from (S) and Table A7 we obtain for the complementary event the probability
P(
1.98 ~ X ~ 2.02) = <I) (
( 1.9H - 2.00 )
2.02 - 2.00)
0.008
- <I)
0.008
=
<1)(2.5) - <P( -2.S)
=
0.9938 - (I - 0.9938)
=
0.9876
= 98~%.
(b) 2 ::':: 0.0164 because for the complementary event we have
0.96
=
P(2 - c ~ X ~ 2
+ c)
or
0.98 = P(X
~
2 +
c)
so that Table A8 gives
0.98 = <P (
1+C-2)
0
.
0.08
2+c-2
0.008
=
2.0S4.
•
c = 0.0164.
Normal Approximation of the Binomial Distribution
The probability function of the binomial distribution is (Sec. 24.7)
(8)
(x
= 0, 1, ... , n).
If 11 is large, the binomial coefficients and powers become very inconvenient. It is of great
practical (and theoretical) importance that in this case the normal distribution provides a
good approximation of the binomial distribution, according to the following theorem, one
of the most important theorems in all probability theory.
SEC. 24.8
1031
Normal Distribution
THEOREM 3
Limit Theorem of De Moivre and Laplace
For large n.
f(x) ~ f*(x)
(9)
Here
f
(x
= 0, 1, ... , n).
is given by (8). The function
(10)
f* (x) =
x - np
---c=---==
V2;v,;pq
v;pq
is the density of the n01711al distribution with mean J.L = np and variance (j2 = npq
(the mean and variance of the billomial distribution). The symbol ~ (read
asymptotically equal) means that the ratio of both sides approaches 1 as n
approaches so. Flirthe17nore. fbr any llOllIlegative integers a and b (> a).
Pea
~ X ~ b) = ~a (:) p~:qn-x ~ cI>(f3) -
cI>(ex),
(11 )
ex=
a - IIp - 0.5
v,;pq
13=
b - I1p + 0.5
v,;pq
A proof of this theorem can be found in [03] listed in App. 1. The proof shows that the
term 0.5 in ex and 13 is a correction caused by the change from a discrete to a continuous
distribution.
11-131
NORMAL DISTRIBUTION
1. Let X be normal with mean 80 and variance 9.
Find P(X > 83). P(X < 81), P(X < 80), and
P(78 < X < 82).
2. Let X be normal with mean 120 and variance 16. Find
P(X;;;; 126), P(X > 116), P(l25 < X < 130).
3. Let X be normal with mean 14 and variance 4. Determine
c such that P(X ;;;; c) = 95%, P(X ;;;; c) = 5%,
P(X;;;; c) = 99.5%.
under the guarantee?
6. If the standard deviation in Prob. 5 were smaller.
would that percentage be smaller or larger?
7. A manufacturer knows from experience that the
resistance of resistors he produces is normal with mean
/L = 150 n and standard deviation (T = 5 n. What
percentage of the resistors will have resistance between
148 nand 152 n? Between 140 nand 160 n?
4. Let X be normal with mean 4.2 and variance 0.04.
Find c such that P(X ;;;; c) = 50%, P(X > C) = 10%,
P(-c < X - 4.2;;;; c) = 99%.
8. The breaking strength X [kg] of a certain type of
plastic block is normally distributed with a mean of
1250 kg and a standard deviation of 55 kg. What is
the maximum load such that we can expect no more
than 5% of the blocks to break?
5. If the lifetime X of a certain kind of automobile
battery is normally distributed with a mean of 4 yr
and a standard deviation of I yr, and the manufacturer
wishes to guarantee the battery for 3 yr, what
percentage of the batteries will he have to replace
9. A manufacturer produces airmail envelopes whose
weight is normal with mean /L = 1.950 grams and
standard deviation if = 0.025 grams. The envelopes
are sold in lots of 1000. How many envelopes in a lot
will be heavier than 2 grams?
1032
CHAP. 24
Data Analysis. Probability Theory
10. If the resistance X of ce11ain wires in an electrical
network is normal with mean 0.01 D and standard
deviation 0.001 D, how many of 1000 wires will meet
the specification that they have resistance between
0.009 and 0.011 Q?
(d) Considering cI>2(ao) and introducing polar
coordinates in the double integral (a standard trick
worth remembering), prove
(12)
11. If the mathematics scores of the SAT college entrance
exams are normal with mean 480 and standard
deviation 100 (these are about the actual values over
the past years) and if some college sets 500 as the
minimum score for new students, what percent of
students will not reach that score?
1
~
\. 2'11"
IX e-
ll 2
12
dll
= 1.
-x
(I) Bernoulli's law oflarge nwnbers.ln an experiment
let an event A have probability p (0 < P < I), and let
X be the number of time~ A happens in /I independent
trials. Show that for any given E > 0,
as
/1-+
x.
(g) Transformation. If X is normal with mean J.L
and variance u 2 , show that X* = clX + C2 (cI > 0)
is normal with mean J.L* = CIJ.L + C2 and variance
U*2
= C1 2 U 2 .
15. WRITING PROJECT. Use of Tables. Give a
systematic discussion of the use of Tables A7 and A8
for obtaining P(X < b), P(X > a), P(a < X < fA
PIX < c) = k. P(X > c) = k. as well as
P(J.L - C < X < J.L + c) = k: include simple examples.
If you have a CAS. describe to what extent it makes
the use of those tables superfluous; give examples.
14. TEAM PROJECT. Normal Distribution. (a) Derive
the formula~ in (6) and (7) from the appropriate normal
table.
(b) Show that cI>(-:) = I - cI>(:). Give an example.
(e) Find the points of inflection of the curve of (1).
24.9
=
(e) Show that u in (1) is indeed the standard deviation
of the normal distribution. [Use (12).]
12. If the monthly machine repair and maintenance cost X
in a ce11ain factory is known to be normal with mean
$12000 and standard deviation $2000, what is the
probability that the repair cost for the next month will
exceed the hudgeted amount of $150007
13. [f sick-leave time X used by employees of a company
in one month is (very roughly) normal with mean 1000
hours and standard deviation 100 hours. how much
time t should be budgeted for sick leave during the next
month if t is to be exceeded with probability of only
20o/c?
cI>(x)
Distributions of Several Random Variables
Distributions of two or more random variables are of interest for two reasons:
1. They occur in experiments in which we observe several random variables, for
example, carbon content X and hardness Y of steel, amount of fertilizer X and yield of
corn Y, height Xl' weight X 2 , and blood pressure X3 of persons, and so on.
2. They will be needed in the mathematical justification of the methods of statistics in
Chap. 25.
In this section we consider two random variables X and Yor, as we also say, a twodimensional random variable (X, Y). For (X, Y) the outcome of a trial is a pair of numbers
X = x, Y = y, briefly (X, Y) = (x, y), which we can plot as a point in the XY-plane.
The two-dimensional probability distribution of the random variable (X, Y) is given
by the distribution function
(1)
F(:.:, y)
=
P(X ~
x,
Y ~ y).
This is the probability that in a trial, X will assume any value not greater than x and in
the same trial, Y will assume any value not greater than y. This corresponds to the blue
region in Fig. 521, which extends to -00 to the left and below. F(x, y) determines the
SEC. 24.9
1033
Distributions of Several Random Variables
Fig. 521.
Formula (1)
probability distribution uniquely, because in analogy to formula (2) in Sec. 24.5, that is,
< X ~ b) = F(b) - F(a), we now have for a rectangle (see Prob. 14)
Pea
As before, in the two-dimensional case we shall also have discrete and continuous
random variables and distributions.
Discrete Two-Dimensional Distributions
In analogy to the case of a single random variable (Sec. 24.5), we call (X, Y) and its
distribution discrete if (X, Y) can assume only finitely many or at most countably infinitely
many pairs of values (XI' YI), (X2' )'2), '" with positive probabilities, whereas the
probability for any domain containing none of those values of (X, Y) is zero.
Let (x;, -':) be any of those pairs and let P(X = Xi' Y = Yj) = Pij (where we admit that
Pij may be 0 for certain pairs of subscripts i, j). Then we define the probability function
f(x. y) of ex, Y) by
(3)
f(x, y) = Pij
if x = Xi, Y = Yj
f(x, y) = 0
and
otherwise;
here, i = 1,2, ... andj = 1,2, ... independently. In analogy to (4), Sec. 24.5, we now
have for the distribution function the formula
(4)
Instead of (6) in Sec. 24.5 we now have the condition
2: 2: f(Xi' Yj)
(5)
=
1.
j
E X AMP LEI
Two-Dimensional Discrete Distribution
If we
~imilltaneously
toss a dime and a nickel and consider
x=
Number of heads the dime turns up,
Y = Number of heads the nickel turns up,
then X and Y can have the values 0 or 1. and the probability function is
fCO, 0) = fO, 0) = f(O, 1) = f(1, 1) =~,
f(x, y) =
°
otherwise.
•
1034
CHAP. 24
Data Analysis. Probability Theory
y
Fig. 522.
Notion of a two-dimensional distribution
Continuous Two-Dimensional Distributions
In analogy to the case of a single random variable (Sec. 24.5) we caIl (X, Y) and its
distribution continuous if the corresponding distribution function F(x, \') can be given by
a double integral
(6)
F(x, y)
=
I
,j
IX
-00
f(x*, y*) dx* dy*
-':X:l
whose integrand f, called the density of (X, Y), is nonnegative everywhere, and is
continuous, possibly except on finitely many curves.
From (6) we obtain the probability that (X, Y) assume any value in a rectangle
(Fig. 522) given by the formula
(7)
E X AMP L E 2
Two-Dimensional Uniform Distribution in a Rectangle
Let R be the rectangle "1 < x
(8)
~
f31' "2
f(x. y) = Ilk
< Y ~ f32' The density (see Fig. 523)
if (x. y)
i~
in R.
f(x, y) = 0 otherwise
defines the so-called uniform distribution ill the rectallgle R: here k = (f31 - "1)({32 - "2) is the area of R.
The distribution function is shown in Fig. 524.
•
y
x
x
o
Fig. 523.
Density function (8) of the
uniform distribution
o
Fig. 524. Distribution function of the
uniform distribution defined by (8)
Marginal Distributions of a Discrete Distribution
This is a rather natural idea, without counterpart for a single random variable. It amounts
to being interested only in one of the two variables in (X, Y), say, X, and asking for its
SEC. 24.9
1035
Distributions of Several Random Variables
distribution, called the marginal distribution of X in (X, V). So we ask for the probability
= x, Yarbitrary). Since (X, Y) is discrete, so is X. We get its probability function,
call it f1(x), from the probability function f(x, y) of (X, Y) by summing over y:
P(X
(9)
f1(x)
=
P(X
=
=
x, Yarbitrary)
2: f(x, y)
y
where we sum all the values of f(x, y) that are not 0 for that x.
From (9) we see that the distribution function of the marginal distribution of X is
(10)
Similarly, the probability function
(11)
f2(Y)
=
P(X arbitrary. Y
=
=
y)
2: f(x. y)
:r
determines the marginal distribution of Y in (X, V). Here we sum all the values of
f(x, y) that are not zero for the conesponding y. The distribution function of this marginal
distribution is
(12)
y*~y
E X AMP L E 3
Marginal Distributions of a Discrete Two-Dimensional Random Variable
In drawing 3 cards with replacement from a bridge deck let us consider
(X. Yl.
x=
NlIl11ber of queens.
Y = Number of kings or lIces.
The deck has 52 cards. These include 4 queens. 4 kings. and 4 aces. Hence in a single trial a queen has probability
4/52 = 1/13 and a king or ace 8/52 = 2/13. This gives the probability function of (X, Y),
3!
(-I
/<x,\") =
.
x! y! (3 - x - y)!
13
)X ( - 2 )Y ( - 10 )3-T-Y
13
(x
13
+ y
~
3)
and fIx. y) = 0 otherwise. Table 24.1 shows in the center the values of fIx, y) and on the right and lower margins
the values of the probability functions hex) am] hey) of the marginal distributions of X and Y, respectively . •
Table 24.1 Values of the Probability Functions f{x, y), fl{X), f 2{Y) in Drawing
Three Cards with Replacement from a Bridge Deck, where X is the Number
of Queens Drawn and Y is the Number of Kings or Aces Drawn
x
0
y
0
I
2
3
f1(x)
1000
2197
600
2197
120
2197
8
2197
1728
2197
,IOU
1
2197
120
2197
12
2197
0
432
2197
2
30
2197
6
2197
0
0
36
2197
3
1
2197
0
0
0
2197
f2(Y)
1331
2197
726
2197
132
2197
8
2197
1
1036
CHAP. 24
Data Analysis. Probability Theory
Marginal Distributions of a Continuous Distribution
This is conceptually the same as for discrete distributions. with probability functions and
sums replaced by densities and integrals. For a continuous random variable (X, Y) with
density f(x, y) we now have the marginal distribution of X in (X. Yl. defined by the
distribution function
(13)
Fl(x)
=
P(X ;:;; x,
-QO
<
Y
<
ce)
=
IX
fl(X*) dx*
-co
with the density f 1 of X obtained from f(x. y) by integration over y,
(14)
fleX)
=
I=
f(x, y) dy.
-x
Interchanging the roles of X and Y, we obtain the marginal distribution of Y in (X, Y)
with the distribution function
(15)
F 2(y) = P( - x
< X<
ce, y;:;; y) =
I
y
f2(V*) dy*
-x
and density
(16)
f2(Y)
=
fX f(x, y) dx.
-co
Independence of Random Variables
X and Y in a (discrete or continuous) random variable (X, Y) are said to be independent
if
(17)
holds for all (x, y). Otherwise these random variables are said to be dependent. These
definitions are suggested by the corresponding definitions for events in Sec. 24.3.
Necessary and sufficient for independence is
(IS)
for all x and y. Here the f's are the above probability functions if (X, Y) is discrete or
those densities if (X, Y) is continuous. (See Prob. 20.)
E X AMP L E 4
Independence and Dependence
In tossing a dime and a nickel. X = Number of heads all the dillie, Y = Number of headf all the nickel may
•
assume the values 0 or I and are independent. The random variables in Table 24.1 are dependent.
Extension of Independence to II-Dimensional Random Variables. This will be needed
throughout Chap. 25. The distribution of such a random variable X
determined by a distribution function of the form
= (Xl> ... , Xn) is
SEC. 24.9
1037
Distributions of Several Random Variables
The random variables Xl> .... Xn are said to be independent if
(19)
for all (Xl> ••• , xn). Here
of Xj in X, that is,
FiXj)
is the distribution function of the marginal distlibution
Otherwise these random variahles are said to be dependent.
Functions of Random Variables
When 11 = 2, we write Xl = X, X 2 = Y, Xl = X, X2 = y. Taking a nonconstant continuous
function g(x, y) defined for all x, y, we obtain a random variable Z = g(X, Y). For example,
if we roll two dice and X and Yare the numbers the dice turn up in a trial, then
Z = X + Y is the sum of those two numbers (see Fig. 513 in Sec. 24.5),
In the case of a discrete random variable (X, Y) we may obtain the probability function
f(:::.) of Z = g(X. Y) by summing all f(x, y) for which g(x, y) equals the value of :::.
considered; thus
(20)
=
f(:::.)
P(Z
LL f(x, y).
= :::.) =
g(x.y)~z
Hence the distribution function of Z is
(21)
=
F(::.)
P(Z ~ :::.)
=
LL f(x, y)
g(x,Y)""Z
where we sum all values of f(x, y) for which g(x, y) ~ z.
In the case of a continuous random variable eX, Y) we similarly have
(22)
F(z)
=
P(Z
~ z) =
ff
f(x, y) dx dy
g(x,Y)""z
where for each z we integrate the density f(x, y) of (X, Y) over the region g(x, y)
the xy-plane. the boundary curve of this region being g(x, v) = z.
~
z in
Addition of Means
The number
LL
(23)
x
E(g(X, Y»
=
{
x
g(x, y)f(x. y)
[(X, Y) discrete]
Y
x
ix ixg(X,
y)f(x, y) dx dy
[(X, Y) continuous]
CHAP. 24
1038
Data Analysis. Probability Theory
is called the mathematical expectatio/1 or, briefly, the expectation of g(X, Y). Here it is
assumed that the double series converges absolutely and the integral of Ig(x, y)I.f(x, y) over
the xy-plane exists (is finite). Since summation and integration are linear processes, we
have from (23)
(24)
E(ag(X, Y)
+ bh(X, Y»
=
aE(g(X, Y»
+ bE(h(X, Y».
An imponam special case is
E(X
+
Y)
= E(X) + E( Y),
and by induction we have the following result.
THEOREM 1
Addition of Means
The mean (expectation) of a sum of random variables equals the
(expectations), that is,
Sll1l1
of the means
Furthermore, we readily obtain
THEROEM 2
Multiplication of Means
The mean (e.\pectation) of the product (~lilldependellt random variables equals the
product qf the meam (expectatiolls), that is,
(26)
PROOF
If X and Yare independent random variahles (both discrete or hoth continuous), then
E(XY) = E(X)E(Y). In fact, in the di~crete case we have
E(XY)
=
2: 2: xyf(x, y)
x
y
=
2: xfl(x) 2: yf2(Y) = E(X)E(y),
y
and in the continuous case the proof of the relation is similar. Extension to
random variables gives (26), and Theorem 2 is proved.
/1
independent
•
Addition of Variances
This is another matter of practical impOltance that we shall need. As before, let Z = X + Y
and denote the mean and Valiance of Z by I-t and u 2 • Then we first have (see Team Project
16(a) in Problem Set 24.6)
SEC. 24.9
1039
Distributions of Several Random Variables
From (24) we see that the first term on the right equals
For the second term on the right we obtain from Theorem I
[E(z)f
=
[E(X)
+
E(y)]2
=
lE(X)f
+
2E(X)E(Y)
+
[E(y)]2.
By substituting these expressions into the formula for u 2 we have
u2
=
E(X2) - [E(X)]2
+
+ E(y2)
- [E(Y)]2
2[E(XY) - E(X)E(Y)].
From Team Project 16, Sec. 24.6, we see that the expression in the first line on the right
is the sum of the variances of X and Y, which we denote by U1 2 and U2 2, respectively.
The quantity in the second line (except for the factor 2) is
(27)
UXY = E(XY) - E(X)E(Y)
and is called the covariance of X and Y. Consequently, our result is
(28)
If X and Y are independent, then
E(XY)
= E(X)E(Y):
hence UXY = 0, and
(29)
Extension to more than two variables gives the basic
THEOREM 3
Addition of Variances
The variance of the sum of independent random variables equals the sum of the
variances of these variables.
CAUTION! In the numerous applications of Theorems 1 and 3 we must always
remember that Theorem 3 holds only for independent variables.
This is the end of Chap. 24 on probability theory. Most of the concepts, methods, and
special distributions discussed in this chapter will play a fundamental role in the next
chapter, which deals with methods of statistical inference, that is, conclusions from
samples to populations. whose unknown properties we want to know and try to discover
by looking at suitable properties of samples that we have obtained.
CHAP. 24
1040
Data Analysis. Probability Theory
1. Let f(x, y) = k when 8 ~ x ~ 12 and 0 ~ y ~ 2 and
zero elsewhere. Find k. Find P(X ~ 11, 1 ~ Y ~ 1.5)
and P(9 ~ X ~ 13, Y ~ I).
2. Find P(X > 2, Y> 2) and P(X ~ 1. Y ~ 1) if (X. Y)
has the density f{x, y) = 1/8 if x ~ O. Y ~ 0, X -t- Y ~ 4.
=
k if x
> O.
y
f(x,
y) =
2500
if
0.99 < x < 1.01. 1.00 < Y < 1.02
> 0, x +
y < 3 and 0
otherwise. Find k. Sketch f(x, y). Find P(X + Y ~ I),
P(Y> X).
3. Let f(x, y)
12. Let X [cm1 and Y [cm 1 be the diameter of a pin and
hole. respectively. Suppose that (X, Y) has the
density
4. Find the density of the marginal distribution of X in
Prob 2.
5. Find the density of the marginal distribution of Y in
Fig. 523.
and 0 otherwise. (a) Find the marginal distributions.
(b) What is the probability that a pin chosen at random
will fit a hole whose diameter is l.00?
13. An electronic device consists of two components. Let
X and Y [months] be the length of time until failure of
the first and second component, respectively. Assume
that (X, Y) has the probability density
6. If certain sheets of wrapping paper have a mean weight
of 10 g each. with a standard deviation of 0.05 g. what
are the mean weight and standard deviation of a pack
of IO 000 sheets?
7. What are the mean thickness and the standard deviation
of transformer cores each consisting of 50 layers of
sheet metal and 49 insulating paper layers if the metal
sheets have mean thickness 0.5 mm each with a
~tandard deviation of 0.05 mm and the paper layers
have mean 0.05 mm each with a standard deviation of
O.02mm?
S. If the weight of certain (empty) containers has mean
2 Ib and standard deviation 0.1 lb. and if the filling of
the containers has mean weight 751b and standard
deviation 0.8 lb. what are the mean weight and standard
deviation of filled containers'!
9. A 5-gear assembly is put together with spacers between
the gears. The mean thickness of the gear~ is 5.020 em
with a standard deviation of 0.003 cm. The mean
thickness of the spacers is 0.040 cm with a standard
deviation of 0.002 cm. Find the mean and standard
deviation of the a~sembled units consisting of 5 randomly
selected gears and 4 randomly selected spacers.
10. Give an example of two different discrete distributions
that have the same marginal distributions.
f(x, y) = 0.01 e -O.l(x+y)
if x > 0 and y > 0
and 0 otherwise. (a) Are X and Y dependent or
independent? (b) Find the densities of the marginal
distributions. (c) What is the probability that the first
component has a lifetime of 10 months or longer?
14. Prove (2).
15. Find P(X > Y) when (X. Y) has the density
ftx, Y) = 0.25e- O.5 (x+y) if x
~
0, y
~
0
and 0 otherwise.
16. Let (X. Y) have the density
f(x,
y) =
k if x 2
+ y2 <
1
and 0 otherwise. Determine k. Find the densities of
the marginal distributions. Find the probability
P(X 2
+
y2
<
1/4).
17. Let (X, Y) have the probability function
f(O.o)
=
f(1. I)
=
lIS.
f(O, 1) = f(l. 0) = 3/8.
11. Show that the random vatiables with the densities
Are X and Y independent?
f(l:, y) =
X
+Y
and
g(x, y) = (x
if 0
~
x
~
g(x. y) =
distribution.
+ ~)(y + ~)
I, 0 ~ Y ~ I and f(x. y) = 0 and
0 elsewhere. have the same marginal
IS. Using Theorem 1, obtain the formula for the mean of
the hypergeometrie distribution. Can you use Theorem
3 to obtain the variance of that distribution?
19. Using Theorems I and 3, obtain the formulas for the
mean and the variance of the binomial distribution.
20. Prove the statement involving (18).
1041
Chapter 24 Review Questions and Problems
. ::a.::iI ...':==IU STIONS AND PROBLEMS
1. Why did we begin the chapter with a section on handling
data?
23. Find the mean, standard deviation, and variance in
Prob.21.
2. What are stem-and-Ieaf plots? Boxplots? Histograms?
Compare their advantages.
24. Find the mean, standard deviation, and variance
Prob.22.
3. What quantities measure the average size of data? The
spread?
25. What are the outcomes of the sample space of
4. Why did we consider probability theory? What is its
26. What are the outcomes in the sample space of the
experiment of simultaneously tossing three coins?
role in statistics?
In
X: Tossing a coin until the first Head appears?
5. What do we mean by an experiment? By a random
variable related with it? What are outcomes? Events?
27. A box contains 50 screws, five of which are defective.
Find the probability function of the random variable
6. Give examples of experiments in which you have
equally likely cases and others in which you don't.
X = Number of defective screws in drawing tl1'O screws
without replacement and compute its values.
7. State the definition of probability from memory.
28. Find the values of the distribution function in Prob. 27.
8. What is the difference between the concepts of a
permutation and a combination?
29. Using a Venn diagram, show that A <::;; B if and only if
AUB=B.
9. State the main theorems on probability. IIIustmte them
by simple examples.
30. Using a Venn diagram, show that A <::;; B if and only if
10. What is the distribution of a random variable? The
distribution function? The probability function'! The
density?
31. If X has the density f(x) = 0.5x (0 :;::: x :;::: 2) and
o otherwise, what are the mean and the Vallance of
X* = -2X + 5?
11. State the definitions of mean and variance of a random
variable from memory.
32. If 6 different inks are available, in how many ways can
we select two colors for a printing job? Four colors?
12. If peA)
=
PCB) and A <::;; B, can A =1= B?
13. If E =1= S (= the sample space), can P(E)
=
I?
14. What distributions correspond to sampling with
replacement and without replacement?
15. When will an experiment involve a binomial
distribution? A hypergeometric distribution?
16. When will the Poisson distribution be a good
approximation of the binomial distribution?
17. What do you know about the approximation of the
binomial distribution by the normal distribution?
An B =A.
33. Compute 5! by the Stirling formula and find the absolute
and relative errors.
34. Two screws are randomly drawn without replacement
from a box containing 7 right-handed and 3 lefthanded screws. Let X be the number of left-handed
screws drawn. Find P(X = 0), P(X = I), P(X = 2),
P(I
<
X
<
2), P(O
<
X
<
5).
35. Find the mean and the variance of the distribution
having the density f(x) = ~e-Ixl.
36. Find the skewness of the distribution with density
f(x) = 2(1 - x) if 0 < x < I, f(x) = 0 otherwise.
18. Explain the use of the tables of the normal distribution.
If you have a CAS, how would you proceed without the
tables?
37. Sketch the probability function f(x) = x 2 /30
(x = 1, 2, 3,4) and the distribution function. Find
19. Can the probability function of a discrete random
variable have infinitely many positive values?
38. Sketch F(x) = 0 if x :;::: 0, F(x) = 0.2x if 0
F(x) = 1 if x > 5, and its density f(x).
20. State the most important facts about distributions of two
random variables and their marginal distributions.
39. If the life of tires is normal with mean 25 000 km and
variance 25 000 000 km 2 , what is the probability that a
given one of those tires will last at least 30 000 km? At
least 35000 km?
21. Make a stem-and-Ieaf plot, histogram, and boxplot of
the data 22.5. 23.2, 22.1, 23.6, 23.3, 23.4, 24.0, 20.6,
23.3.
22. Do the same task as in Prob. 21, for the data 210, 213,
209,218,210,215,204,211,216,213.
<
/-L.
x :;::: 5,
40. If the weight of bags of cement is normal with mean
50 kg and standard deviation 1 kg, what is the
probability that 100 bags will be heavier than 5030 kg?
1042
CHAP. 24
Data Analysis. Probability Theory
: ' ·-11 :
Data Analysis. Probability Theory
A random experiment, briefly called experiment, is a process in which the result
("outcome") depends on "chance" (effects of factors unknown to us). Examples are
games of chance with dice or cards, measuring the hardness of steel. observing
weather conditions, or recording the number of accidents in a city. (Thus the word
"experiment" is used here in a much wider sense than in common language.) The
outcomes are regarded as points (elements) of a set S, called the sample space,
whose subsets are called events. For events E we define a probability PtE) by the
axioms (Sec. 24.3)
o~
(1)
peE)
u
£2 U ... )
=
I
= I
peS)
peEl
~
peEl)
+ P(£2) +
These aXiOms are motivated by properties of frequency distributions of data
(Sec. 24.1).
The complement ~ of E has the probability
(2)
=
P(EC )
1 - peE).
The conditional probability of an event B under the condition that an event A
happens is (Sec. 24.3)
(3)
p(BIA)
=
peA n B)
peA)
[peA)
> 0].
Two events A and B are called independent if the probability of their simultaneous
appearance in a trial equals the product of their probabilities, that is, if
(4)
PtA
n
B)
=
P(A)P(B).
With an experiment we associate a random variable X. This is a function defined
on S whose values are real numbers; furthermore, X is such that the probability
p(X = a) with which X assumes any value a, and the probability pea < X ~ b)
with which X assumes any value in an interval a < X ~ b are defined (Sec. 24.5).
The probability distribution of X is determined by the distribution function
(5)
F(x)
=
P(X
~
x).
In applications there are two important kinds of random variables: those of the
discrete type, which appear if we count (defective items, customers in a bank, etc.)
and those of the continuous type, which appear if we measure (length, speed,
temperature, weight, etc.).
1043
Summary of Chapter 24
A discrete random variable has a probability function
f(x) = P(X = x}.
(6)
Its mean 11- and variance a 2 are (Sec. 24.6)
(7)
11-
=
L
u2
and
xjf(xj)
L
=
j
(Xj - 11-}2f(xj)
j
where the Xj are the values for which X has a positive probability. Important discrete
random variables and distributions are the binomial. Poisson. and hypergeometric
distributions discussed in Sec. 24.7.
A continuous random variable has a density
(8)
[see (5)j.
f(x} = F'(x}
Its mean and variance are (Sec. 24.6)
(9)
11- =
fO
xf(x) dx
u2 =
and
-=
fC
(x - 11-)2f(x) dt:.
-oc
Very important is the normal distribution (Sec. 24.8), whose density is
I
f(1;) = - - exp
(l0)
u\,f2;
[1 (x -
11_._u
- 2
)2J
and whose distribution function is (Sec. 24.8: Tables A 7. A8 in App. 5)
(II)
A two-dimensional random variable (X, Y) occurs if we simultaneously observe
two quantities (for example, heightX and weight Yof adults}. Its distribution function
is (Sec. 24.9)
( 12)
F(x, y}
=
P(X
~
x, Y
~ y}.
X and Y have the distribution functions (Sec. 24.9)
(13)
FI(x}
=
P(X
~
x, Yarbitrary)
and
F 2 (y)
= P(x arbitrary,
Y
~
y)
respectively; their distribution-; are called marginal distributions. If both X and Y
are discrete. then (X. Y) has a probability function
f(x, y}
=
P(X
=
x, Y
=
y}.
If both X and Yare continuous. then (X. Y) has a density f(x, y).
CHAPTER
Of
2 5
Mathematical Statistics
In probability theory we set up mathematical models of processes that are affected by
"chance". In mathematical statistics or, briefly. statistics, we check these models against
the observable reality. This is called statistical inference. It is done by sampling, that
is, by drawing random samples, briefly called samples. These are sets of values from a
much larger set of values that could be studied, called the popUlation. An example is
10 diameters of screws drawn from a large lot of screws. Sampling is done in order to
see whether a model of the population is accurate enough for practical purposes. If this
is the case, the model can be used for predictions, decisions, and actions, for instance, in
planning productions, buying equipment, investing in business projects, and so on.
Most important methods of statistical inference are estimation of parameters
(Secs. 25.2). determination of confidence intervals (Sec. 25.3), and hypothesis testing
(Secs. 25.4. 25.7. 25.8). with application to quality control (Sec. 25.5) and acceptance
sampling (Sec. 25.6).
In the last section (25.9) we give an introduction to regression and correlation analysis,
which concern experiments involving two variables.
Prerequisite: Chap. 24.
Sections that may be omitted in a shorter course: 25.5, 25.6. 25.8.
References, Answers to Problems. a1ld Statistical Tables: App. I Part G, App. 2,
App.5.
25.1
Introduction.
Random Sampling
Mathematical statistics consists of methods for designing and evaluating random
experiments to obtain information about practical problems. such as exploring the relation
between iron content and density of iron ore, the quality of raw material or manufactured
products, the efficiency of air-conditioning systems, the performance of certain cars, the
effect of advertising, the reactions of consumers to a new product, etc.
Random variables occur more frequently in engineering (and elsewhere) than one
would think. For example, properties of mass-produced articles (screws, lightbulbs. etc.)
always show random variation, due to small (uncontrollable!) differences in raw material
or manufacturing processes. Thus the diameter of screws is a random variable X and we
have nOlldefecfive screws. with diameter between given tolerance limits, and defective
screws, with diameter outside those limits. We can ask for the distribution of X, for the
percentage of defective screws to be expected, and for necessary improvements of the
production process.
1044
SEC. 25.1
Introduction.
1045
Random Sampling
Samples are selected from populations-20 screws from a lot of 1000, 100 of 5000
voters, 8 beavers in a wildlife conservation project-because inspecting the entire
population would be too expensive, time-consuming, impossible or even senseless (think
of destructive testing of lightbulbs or dynamite). To obtain meaningful conclusions,
samples must be random selections. Each of the 1000 screws must have the same chance
of being sampled (of being drawn when we sample), at least approximately. Only then
will the sample mean x = (Xl + ... + X20)120 (Sec. 24.1) of a sample of size 11 = 20
(or any other 11) be a good approximation of the population mean JL (Sec. 24.6); and the
accuracy of the approximation will generally improve with increasing II, as we shall see.
Similarly for other parameters (standard deviation, variance, etc.).
Independent sample values will be obtained in experiments with an infinite sample
space S (Sec. 24.2), certainly for the normal distribution. This is also true in sampling with
replacement. It is approximately true in drawing small samples from a large finite population
(for instance. 5 or 10 of 1000 items). However. if we sample without replacement from a
small population, the effect of dependence of sample values may be considerable.
Random numbers help in obtaining samples that are in fact random selections. This
is sometimes not easy to accomplish because there are many subtle factors that can bias
sampling (by personal interviews, by poorly working machines, by the choice of nontypical
observation conditions, etc.). Random numbers can be obtained from a random number
generator in Maple, Mathematica, or other systems listed on p. 991. (The numbers are
not truly random, as they would be produced in flipping coins or rolling dice, but are
calculated by a tricky formula that produces numbers that do have practically all the
essential features of true randomness.)
E X AMP L E 1
Random Numbers from a Random Number Generator
To select a sample of size n = 10 from 80 given ball bearings, we number the bearings from I to 80. We then
let the generator randomly produce 10 of the integers from I to 80 and include the bearings with the numbers
obtained in our sample. for example.
44 55 53 03 52 61
67 78 39 54
or whatever.
Random numbers are also contained in (older) statistical tables.
•
Representing and processing data were considered in Sec. 24.1 in connection with
frequency distributions. These are the empirical counterparts of probability distributions
and helped motivating axioms and properties in probability theory. The new aspect in this
chapter is randomness: the data are samples selected randomly from a population.
Accordingly, we can immediately make the connection to Sec. 24.1. using stem-and-leaf
plots, box plots. and histograms for representing samples graphically.
Also, we now call the mean x in (5), Sec. 24.1, the sample mean
(1)
We call
(2)
11
the sample size, the variance
S2
in (6), Sec. 24.1, the sample variance
1046
CHAP. 25
Mathematical Statistics
and its positive square root s the sample standard deviation..\', 52, and
parameters ~l a sample; they will be needed throughout this chapter.
25.2
5
are called
Point Estimation of Parameters
Beginning in this section, we shall discuss the most basic practical tasks in statistics and
corresponding statistical methods to accomplish them. The first of them is point estimation
of parameters, that is, of quantities appearing in distributions, such as p in the binomial
distribution and JL and u in the normal distribution.
A point estimate of a parameter is a number (point on the real line). which is computed
from a given sample and serves as an approximation of rhe unknown exact value of the
parameter of the population. An interval estimate is an interval ("confidence il1terval"')
obtained from a sample; such estimates will be considered in the next section. Estimation
of parameters is of great practical importance in many applications.
As an approximation of the mean JL of a popUlation we may take the mean .X' of a
corresponding sample. This gives the estimate /L = .\' for JL, that is,
(1)
JL=X=
tXl
+ ... + x.,,)
11
where n is the sample size. Similarly. an estimate &2 for the variance of a popUlation is
the variance S2 of a corresponding sample, that is,
(2)
Clearly, (1) and (2) are estimates of parameters for distributions in which JL or u 2
appear explicity as parameters. such as the normal and Poisson distributions. For the
binomial distribution, p = JLln lsee (3) in Sec. 24.71. From (1) we thus obtain for p
the estimate
(3)
jJ=
x
1l
We mention that (1) is a special case of the so-called method of moments. In this
method the parameters to be estimated are expressed in terms of the moments of the
distribution (see Sec. 24.6). In the resulting formulas those moments of the distribution
are replaced by the corresponding moments of the sample. This gives the estimates. Here
the kth moment of a sample X10 •••• Xn is
SEC. 25.2
1047
Point Estimation of Parameters
Maximum Likelihood Method
Another method for obtaining estimates is the so-called maximum likelihood method of
R. A. Fisher [Messellger Math. 41 (1912). 155-160]. To explain it we consider a discrete
(or continuous) random variable X whose probability function (or density) f(x) depends
on a single parameter e. We take a corresponding sample of 11 illdepelldent values
Xl' • • • • X1]" Then in the discrete ca~e the probability that a sample of size 1l consists
precisely of those 11 values is
(4)
In the continuous case the probability that the sample consists of values in the small
intervals}.; ~ x ~ Xj + tu (j = 1,2, .. ',11) is
(5)
Since f(xj) depends on e, the function I in (5) given by (4) depends on Xl • . • . , Xn and
e. We imagine Xl, • • . , Xn to be given and fixed. Then I is a function of e. which is called
the likelihood function. The basic idea of the maximum likelihood method is quite simple,
as follows. We choose that approximation for the unknown value of e for which I is as
large as possible. If I is a differentiable function of e, a necessary condition for I to have
a maximum in an interval (not at the boundary) is
ae = o.
(6)
(We write a partial derivative. because I depends also on Xlo • • • • xn") A solution of (6)
depending onx1' ... , Xn is called a maximum likelihood estimate for e. We may replace
(6) by
a In I
(7)
ae
= 0,
because f(xj) > 0, a maximum of I is in general positive, and In I is a monotone increasing
function of I. This often simplifies calculations.
Several Parameters. If the distribution of X involves r parameters e1 , . . • , en then
instead of (6) we have the r conditions al/ae1 = 0, ... , rJllae,. = 0, and instead of (7)
we have
(8)
E X AMP L E 1
a In I
ae,. = o.
= O.
Normal Distribution
Find maximum likelihood estimates tor
Solutioll.
(JI
= /L and
(J2
= u in the case of the normal distribution.
From (1). Sec. 24.8. and (4) we obtain the likelihood function
I
=
(~)n (~)n
"271"
u
e- h
where
1048
CHAP. 25
Mathematical Statistics
Taking logarithms. we have
In 1=
-11
In
v T;;- -
/I
In u - h.
The first equation in (8) is vOn I)lvf.L = O. written out
hence
L"
'\J - /If.L = O.
j~l
The ~olution is the desired estimate [L for f.L: we find
LXj=.r.
Il j=l
The
~econd
equation in (8)
i~
a(ln l)trlu = O. written out
v In 1
11
ill!
u
au
Replacing f.L by [Land solving for u 2 • we obtain the estimate
-2
U
1 ~
_2
= - L.. (Xj - xl
n
j=l
which we shall use in Sec. 25.7. Note that this differs from (2). We cannot discuss criteria for the goodness of
estimates but want to mention that for 'mall 11. formula (2) is preferable.
•
--•...---_............
......... _......
...
... ...
..-. ..-.
~--.-~
1. Find the maximum likelihood estimate for the
parameter f.L of a nOlmal distribution with known
variance u 2 = uo2 .
2. Apply the maximum likelihood method to the normal
distribution with f.L = O.
3. (Binomial distribution) Derive a maximum likelihood
estimate for p.
4. Extend Prob. 3 as follows. Suppose that 111 times 11
trials were made and in the first 11 trials A happened
kl times, in the second n trials A happened k2 times,
... , in the mth 11 triab A happened k m times. Find a
maximum likelihood estimate of p based on this
information.
5. Suppose that in Prob. 4 we made 4 times 5 trials and
A happened 2, I, ..k 4 times, respectively. Estimate p.
6. Consider X = Number of independent trials IImil all
el'ent A occurs. Show that X has the probability
function f(x) = pqX-l. X = l. 2 ..... where p is the
probability of A in a single trial and q = I - p. Find
the maximum likelihood estimate of p corresponding
to a sample .\'1, ••• , Xn of observed values of X.
7. In Prob. 6 find the maximum likelihood estimate of p
corresponding to a single observation x of X.
8. In rolling a die. suppose that we get the first Six in the
7th trial and in doing it again we get it in the 6th trial.
Estimate the probability p of getting a Six in rolling
that die once.
9. (Poisson distribution) Apply the maximum likelihood
method to the Poisson distribution.
10. (Uniform distribution) Show that in the case of the
parameters a and b of the uniform distribution (see
Sec. 24.6), the maximum likelihood estimate cannot be
obtained by equating the first derivative to zero. How
can we obtain maximum likelihood estimates in this
case?
11. Find the maximum likelihood estimate of e in the
density f(x) = ee- HX if x ~ 0 and f(x) = 0 if x < O.
12. In Prob. I I. find the mean f.L. substitute it in fex). find
the maximum likelihood estimate of IL. and show that
It is identical with the estimate for f.L which can be
obtained from that for e in Prob. I I.
13. Compute in Prob. 11 from the sanlple 1.8, 0.4. 0.8.
0.6. 1.4. Graph the sample distribution function Fcx)
and the distribution function F(x) of the random
variable, with e =
on the same axes. Do they agree
reasonably well? (We consider goodness of fit
systematically in Sec. 25.7.)
e
e.
SEC 25.3
Confidence Intervals
14. Do the same task as in Frob. 13 if the given sample is
0.5.0.7.0.1. 1.1. 0.1.
15. CAS EXPERIMENT.
Maximum Likelihood
Estimates. (MLEs). Find experimentally how much
25.3
1049
MLEs can differ depending on the ~ample ~ize. Hillf.
Generate many samples of the same size II. e.g.. of the
standardized normal distribution. and record i and S2.
Then increase /I.
Confidence Intervals
Confidence intervals 1 for an unknown parameter 8 of some distribution (e.g .. 8 = J-L) are
intervals 81 :2: 8 :2: 82 that contain 8, not with certainty but with a high probability 'Y.
which we can choose (95% and 99% are popular). Such an interval is calculated from a
sample. 'Y = 95% means probability I - 'Y = 5% = 1/20 of being wrong--Dne of about
20 such intervals will not contain 8. Instead of writing 81 :2: e ~ 82 , we denote this more
distinctly by writing
(1)
Such a special symbol, CONE seems worthwhile in order to avoid the misunderstanding
that 8 mllst lie between 81 and 82 ,
'Y is called the confidence level, and 81 and 82 are called the lower and upper
confidence limits. They depend on 'Y. The larger we choose 'Y. the smaller is the error
probability 1 - 'Y, but the longer is the confidence interval. If 'Y ---7 I, then its length goes
to infinity. The choice of 'Y depends 011 the kind of application. In taking no umbrella, a
5% chance of getting wet is not tragic. In a medical decision of life or death, a 5% chance
of being wrong may be too large and a I % chance of being wrong ('Y = 99%) may be
more desirable.
Confidence intervals are more valuable than point estimates (Sec. 25.2). Indeed. we can
take the midpoint of (1) as an approximation of 8 and half the length of ( I) as an "error
bound" (not in the strict sense of numerics. but except for an error whose probability we
know).
81 and 82 in (1) are calculated from a sample Xl, . . . • X n . These are 11 observations of
a random variable X. Now comes a standard trick. We regard Xl, ••• , X 11 as single
observations of n random variables Xl' ... , Xn (with the same distribution, namely, that
ofX)· Then 81 = 81 (X1, ••• , xn) and 82 = 82 (x 1 , . • • , xn) in (I) are observed values of
two random variables 8 1 = 8 1 (X1 , . . • , Xn) and 8 2 = 8 2 (X1 , ••• , X,,). The condition
(I) involving 'Y can now be written
(2)
Let us see what all this means in concrete practical cases.
In each case in this section we shall first state the steps of obtaining a confidence interval
in the form of a table, then consider a typical example, and finally justify those steps
theoretically.
1 JERZY NEYMAN (1894-1981 l. American statistician, developed the theory of confidence intervals (Alll/als
of Mathematical Statistics 6 (1935). 111-116).
CHAP. 25
1050
Mathematical Statistics
Confidence Interval for JL of the Normal Distribution
with Known (J"2
Table 25.1 Determination of a Confidence Interval for the Mean p. of
a Normal Distribution with Known Variance u 2
Step 1. Choose a confidence level y (95%, 99%, or the like).
Step 2. Determine the conesponding c:
'Y
c
I 0.90
1.645
0.95
0.99
0.999
1.1)60
2.576
3.21) 1
Step 3. Compute the mean .f of the sample Xl>
•••• Xu-
Step -I. Compute k = cuIyr-;;. The confidence interval for jL is
(3)
E X AMP L E 1
CONFy
Ix -
k
~
p. ~
x+
k).
Confidence Interval for jL of the Normal Distribution with Known u 2
Deterimine a 95'ff confidence interval for the mean of a normal distribution with variance
sample of 11 = 100 values with mean x = 5.
0-
2
=
9. using a
Solution. Step 1. l' = 0.95 is required. Step 2. The corresponding c equals 1.960; see Table 25.1.
Step 3..r = 5 is ghen. Step 4. We need k = 1.960' 3/v'lOo = 0.588. Hence r - k = 4.412..1' + k = 5.588
and the confidence interval is CONFo.95 [4.412 ::'" /L::'" 5.588}.
This is sometimes written /L = 5 ::':: 0.588, but we shall not lise this notation, which can be misleading.
With your CAS YOll can determine this interval more directly. Similarly for the other examples in this ,eetion...
Theory for Table 25.1.
THEOREM 1
The method in Table 25.1 follows from the basic
Sum of Independent Normal Random Variables
Let Xl' ... , X" be illdependent nonnal random variables each
jL and I'ariallce u 2 . Theil the followillg holds.
la) The Slllll Xl
+ ... +
has mean
Xn is Ilonlla! with meall IljL alld variallce IlU 2 .
(b) The following random variable
(4)
(~fwhich
X
I
= -
n
X is normal with mean
(Xl
jL and variallce u 2 /n.
+ ... + Xn)
le) The followillg ralldom variable Z is Ilonllal with melill 0 alld variallce l.
(5)
PROOF
The statements about the mean and variance in (a) follow from Theorems 1 and 3 in
Sec. 24.9. From this and Theorem 2 in Sec. 24.6 we see that X has the mean (I/1l)lljL = jL
and the variance (l11ly2nu 2 = u 2 /n. This implies that Z has the mean 0 and variance I.
by Theorem 2(b) in Sec. 24.6. The nonnality of Xl + ... + Xn is proved in Ref. [031
listed in App. 1. This implies the normality of (4) and (5).
•
SEC. 25.3
Confidence Intervals
1051
Derivation of (3) in Table 25.1. Sampling from a normal distribution gives independent
sample values (see Sec. 25.1). so that Theorem I applies. Hence we can choose 'Yand
then determine c such that
(6)
P( -c
~
Z
~
c) = P ( -c
X-11-
~ ~ ~
a/v 11
c ) = <D(c) - (1:>( -c) = 'Y.
For the value 'Y = 0.95 we obtain ::.(D) = 1.960 from Table AS in App. 5. as used in
Example 1. For 'Y = 0.9. 0.99. 0.999 we get the other values of c listed in Table 25.1.
Finally. all we have to do is to convert the inequality in (6) into one for 11- and insert
observed values obtained from the sample. We multiply -c ~ Z ~ c by -I and then by
a/V;;. writing car\!;; = k (as in Table 25.1),
P( - c
~ Z ~ c) = P( c ~
- Z
~
- c)
=
P (c
= PCk
Adding X gives P(X
(7)
+
k
~ JL ~
X - k)
P(X - k
=
'Y
~ JL ~
~
JL - X
:> -
a/V;; =
~ JL -
X
~
c)
-k) = 'Y.
or
X
+
k)
= y.
Inserting the observed value.X' of X gives (3). Here we have regarded Xl> • • • • Xn as single
observations of Xl' ...• X" (the standard trick!). so tha!..xI + ... + Xn is an observed
value of Xl + ... + Xn and .X' is an observed value of X. Note further that (7) is of the
•
form (2) with 8 1 = X - k and 8 2 = X + k.
E X AMP L E 2
Sample Size Needed for a Confidence Interval of Prescribed Length
How large must
Solution.
11
be in Example I if we want to obtain a 95% confidence interval of length L = OA?
ll1e interval (3) has the length L = 2k =
'leu'v;,. Solving for 11. we obtain
In the present case the answer is 11 = (2 . 1.960' 310.4)2 = 870.
Figure 525 shows how L decreases as 11 increases and that for l' = 99% the confidence interval is substantially
longer than for l' = 95% (and the same sample size Ill.
•
0.6 r - , , - - - - - - - - - ,
0.4
Llu
0.2
-+00
n
L-- _
500
Fig. 525. Length of the confidence interval (3) (measured in multiples of tT)
as a function of the sample size n for 'Y = 95% and y = 99%
1052
CHAP. 25
Mathematical Statistics
Confidence Interval for J.L of the Normal Distribution
With Unknown 0-2
In practice a 2 is frequently unknown. Then the method in Table 25.1 does not help and
the whole theory changes, although the steps of determining a confidence interval for /-L
remain quite similar. They are shown in Table 25.2. We see that k differs from that in
Table 25.1, namely, the sample standard deviation s has taken the place of the unknown
standard deviation a of the popUlation. And c now depends on the sample size 11 and must
be determined from Table A9 in App. 5 or from your CAS. That table lists values z for
given values of the distribution function (Fig. 526)
F(~) =
(8)
I_=
Z
K",
u2
(
I
+ -
)-(1n+
1)/2
du
III
of the t-distribution. Here, I1l (= I, 2, ... ) is a parameter, called the number of degrees
of freedom of the distribution (abbreviated d.f.). [n the present case.
I7l = 1l I; see Table 25.2. The constant Km is such that F(x) = 1. By integration it
turns out that Km = r(~111 + ~)/[V;;;;: r(~1Il)]. where r is the gamma function (see (24)
in App. A3.1).
T 'lIe 25.2 Determination of a Confidence Interval for the Mean I.t
of a Normal Distribution with Unknown Variance 0"2
Step 1. Choose a confidence level y (95%.99%. or the like).
Step 2. Determine the solution c of the equation
(9)
F(c)
=
~(l
+
y)
from the table of the t-distribution with 11 - I degrees of freedom
(Table A9 in App. 5; or use a CAS; 11 = sample size).
Step 3. Compute the mean
x
and the variance
S2
of the sample
X b · · · 'Xno
Step 4. Compute k
(10)
=
cst\;;;. The confidence interval is
CONFl'
{x -
k :2i /-L :2i
x+
k}.
y
3 d.f.
r.~~f~
l.°l~
r
y
0.8
0.6
o
d'
-3
-2
Fir 526.
-1
0
2
3
x
Distribution functions of the tdistnbution with 1 and 3 dJ. and of the
standardized normal distribution (steepest curve)
Fig. 527. Densities of the t-distribution
with 1 and 3 dJ. and of the standardized
normal distribution
SEC. 25.3
Confidence Intervals
1053
Figure 527 compares the curve ofthe density ofthe t-distribution with that of the normal
distribution. The latter is steeper. This illustrates that Table 25.1 (which uses more
information, namely, the known value of ( 2 ) yields shorter confidence intervals than Table
25.2. This is confirmed in Fig. 528, which also gives an idea of the gain by increasing
the sample size.
2,--,-rr-----,----,----,
\\
L'lL l.5
lL-____~____~_____ L_ _ _ _~
o
10
20
11
Fig. 528. Ratio of the lengths L' and L of the confidence
intervals (10) and (3) with y = 95% and y = 99% as a function
of the sample size n for equals and (T
E X AMP L E 3
Confidence Interval for p. of the Normal Distribution with Unknown u 2
Five independent measurements of the point of int1ammation (flash point) of Diesel oil (D-2) gave the values
(in OF) 144 147 146 142 144. Assuming normality. determine a 99% confidence interval for the mean.
Solutioll.
Step 1. y =
Step 2. FCc) = ~(1
Step 3.."i' = 144.6,
+
O.l)l)
is required.
y) = 0.995. and Table A9 in App. 5 with
s2 =
11 -
I = 4 d.f. gives c = 4.60.
3.8.
Step 4. k = V3.8. 4.60/Vs = 4.01. The confidence interval is CONFo.99 {140.5 ;;;; fL ;;;; 148.7J.
If the variance (T2 were known and equal 10 the sample variance s2. thus 0"2 = 3.8. then Table 25.1 would
give k = culV-;' = 2.576V3.8/V5 = 2.25 and CONFR99 {142.35;;;; fL;;;; 146.85J. We see that the present
interval is almost twice as long as that obtained from Table 25.1 (with 0"2 = 3.8). Hence for small sample, the
ditlerence is considerable! Sec also Fig. 528.
•
Theory for Table 25.2.
THEOREM 2
For deriving (10) in Table 25.2 we need from Ref. lG3]
Student's t-Distribution
Let Xl ..... Xn be independent normal random variables with the same mean p.
and the slime raricmce a 2 • Then the rllndom variable
X-p.
(11)
has a t-distribution [see (8)] with
by (4) alld
(12)
T=--
S/V;;
11
I degrees oj ji·eedo/Jl (d.f.): here X is given
1054
CHAP. 25
Mathematical Statistics
Derivation of (10). This is similar to the derivation of (3). We choose a number 'Y
between 0 and I and determine a number (' from Table A9 in App. 5 with 11 - I dJ. (or
from a CAS) such that
(13)
P(-c
~ T~
=
c)
F(c) - F(-c)
= ')'.
Since the t-distribution is symmetric, we have
F( -c) = I - F(c),
and (13) assumes the form (9). Substituting (11) into (13) and transforming the result as
before, we obtain
(14)
~
P(X - K
/-L ~ X
+
K) = 'Y
where
K = cS/Y;;.
By inserting the observed values .'t of X and
S2
of S2 into (14) we finally obtain (10) . •
Confidence Interval for the Variance
of the Normal Distribution
0-
2
Table 25.3 shows the steps. which are similar to those in Tables 25.1 and 25.2.
Table 25.3 Determination of a Confidence Interval for the Variance
0"2 of a Normal Distribution, Whose Mean Need Not Be Known
Step 1. Choose a confidence level 'Y (95%, 99%. or the like).
Step 2. Determine solutions
Cl
and
C2
of the equations
(15)
from the table of the chi-square distribution with 11 - I degrees of
freedom (Table AlO in App. 5; or use a CAS: 11 = sample size).
Step 3. Compute (n - 1 )S2, where
Step 4. Compute kl = (11 confidence interval is
S2
l)s2/c ]
is the variance of the sample
and
k2 =
(/1 -
I)S2/C2 .
The
(16)
E X AMP L E 4
Confidence Interval for the Variance of the Normal Distribution
Detennine a 95'i!' confidence interval (16) for the variance. using Table 25.3 and a sample (tensile strell"th of
.
2
.
e
sheet steel m kg/mm . rounded to mteger values)
89
84
R7
81
89
R6
91
90
78
89
R7
99
83
R9.
SEC. 25.3
1055
Confidence Intervals
Solution. Step 1.
Step 2. For
11 -
'Y = 0.95 is required.
I = 13 we find
("I =
Step 3. 13s2
=
Step -I. l3s 2 1c1
5.01
and
c2
= 24.7~.
326.9.
= 65.25.
l3s 21c2
= 13.21.
The confidence interval is
CONFo.95 {l3.2l ~
This
IS
fT2
~ 65.25}.
rather large, and for obtaining a more precise result, one would need a much larger sample.
•
Theory for Table 25.3. In Table 25.1 we used the normal distribution. in Table 25.2
the t-distribution. and now we shall use the X2-distribution (chi-square distribution),
whose distlibution function is F(:::) = 0 if.: < 0 and
F(.:)
=
em
f
z
e-u/211cm-2)/2
du
if.:
~
0
(Fig. 529).
o
y
0.8
0.6
0.4
0.2
o
Fig. 529.
4
6
10
8
x
Distribution function of the chi-square distribution with 2, 3, 5 dJ.
The parameter III (= 1. 2, ... ) is called the number of degrees of freedom (d.L), and
Note that the distribution is not symmetric (see also Fig. 530).
For deriving (16) in Table 25.3 we need the following theorem.
THEOREM 3
Chi-Square Distribution
Under the assllmptions il1 Theorem 2 the random variable
(17)
with S2 giren by (12) has a chi-square distribution with
Proof in Ref. [03]. listed in App. I.
11 -
I degrees offreedom
1056
CHAP. 25
Mathematical Statistics
y
0.5
\
0.4
2 d.f.
0.3
0.2
I
I
,~
3 d.f.
0.1 :
o
Fig. 530.
2
6
4
8
10
x
Density of the chi-square distribution with 2, 3, 5 dJ.
Derivation of (16). This is similar to the derivation of (3) and (10). We choose a
number l' between 0 and 1 and determine c] and C2 from Table AIO, App. 5, such that
[see (15)]
Subtraction yields
Transforming
CI
~ Y ~
C2
with Y given by (17) into an inequality for u 2 , we obtain
n - 1 2
---s
C2
By inserting the observed value
S2
~u
2
n - I 2
~---s.
CI
of S2 we obtain (16).
•
Confidence Intervals for Parameters
of Other Distributions
The methods in Tables 25.1-25.3 for confidence intervals for J.1- and u 2 are designed for
the normal distribution. We now show that they can also be applied to other distrihutions
if we use large samples.
We know that if Xl' ... , Xn are independent random variables with the same mean JL
and the same valiance u 2 , then their sum Yn = Xl + ... + X" has the following properties.
(A) Yn has the mean nJL and the variance nu 2 (by Theorems I and 3 in Sec. 24.9).
(B) If those variables are normal, then Yn is normal (by Theorem 1).
If those random variables are not normal, then (B) is not applicable. However, for large
n the random variable Yn is still approximately normal. This follows from the central limit
theorem, which is one of the most fundamental results in probability theory.
SEC. 25.3
1057
Confidence Intervals
Central Limit Theorem
THEOREM 4
Let Xl> ... , X", ... be independent random variables that have the same
distribution function and therefore the same mean /-L and the same variance a 2 . Let
Yn = Xl + ... + Xn . Then the random variable
(18)
is asymptotically normal with mean 0 and variance 1; that is. the distribution
function Fn(X) of Zn satisfies
lim Fn(x)
n~oo
=
I
-yI2;
I
<I>(x) = - -
x
e-
u2/2
duo
-00
A proof can be found in Ref. [03] listed in App. 1.
Hence when applying Tables 25.1-25.3 to a non normal distribution, we must use
sllfficiently large samples. As a rule of thumb, if the sample indicates that the skewness
of the distribution (the asymmetry; see Team Project 16(d), Problem Set 24.6) is small,
use at least Il = 20 for the mean and at least n = 50 for the variance.
_...-.. .....
_ ••• w ..... • _ _ .... _
11-71
... ---
....... ..-. • ..-.
--.
...,
MEAN (VARIANCE KNOWN)
1. Find a 95% confidence interval for the mean JL of a
normal population with standard deviation 4.00 from
the sample 30. 42, 40, 34, 48, 50.
2. Does the interval in Prob. I gel longer or shorter if we
take 'Y = 0.99 instead of 0.95? By what factor?
3. By what factor does the length of the interval in Prob. 1
change if we double the sample size?
4. Find a 90% confidence interval for the mean JL of a
nonnal population with variance 0.25, using a sample
of 100 values with mean 212.3.
5. What sample size would be needed for obtaining a 95%
confidence interval (3) of length 2u? Of length u"?
18-121
MEAN (VARIANCE UNKNOWN)
Fwd a 99% confidence interval for the mean of a nonnal
popUlation from the sample:
8. 425, 420. 425, 435
9. Length of 20 bolts with sample mean 20.2 cm and
sample variance 0.04 cm2
10. Knoop hardness of diamond 9500, 9800, 9750, 9200,
9400, 9550
11. Copper content (%) of brass 66. 66. 65. 64. 66. 67, 64.
65,63,64
12. Melting point eC) of aluminum 660, 667. 654, 663, 662
6. (Use of Fig. 525) Find a 95% confidence interval for
a sample of 200 values with mean 120 from a normal
distribution with variance 4, lIsing Fig. 525.
13. Find a 95% confidence interval for the percentage of
cars on a certain highway that have poorly adjusted
brakes. using a random sample of 500 cars stopped at
a roadblock on that highway. 87 of which had poorly
adjusted brakes.
7. What sample size i~ needed to obtain a 99% confidence
interval of length 2.0 for the mean of a normal
population with variance 25? Use Fig. 525. Check by
calculation.
14. Find a 99% confidence interval for p in the binomial
distribution from a classical result by K. Pearson, who
in 24000 trials oftossing a coin obtained 12012 Heads.
Do you think that the coin was fair?
CHAP. 25
1058
Mathematical Statistics
[ii-20 I VARIANCE
Find a Y5% confidence interval for the variance of a normal
population from the sample:
15. A sample of 30 values with variance 0.0007
16. The sample in Prob. 9
17. The sample in Prob. II
18. Carbon monoxide emission (grams per mile) of a
certain type of passenger car (cruising at 55 mph):
17.3,17.8,18.0,17.7,18.2,17.4. 17.6. 18.1
19. Mean energy (keV) of delayed neutron group (Group
3, half-life 6.2 sec.) for uranium U235 fission: 435, 451,
430,444,438
20. Ultimate tensile strength (k psi) of alloy steel
(Maraging H) at room temperature: 251, 255, 258, 253,
253,252,250,252,255,256
21. If X is normal with mean 27 and variance 16, what
distributions do -X, 3X, and 5X - 2 have?
22. If Xl and X2 are independent normal random variables
25.4
with mean 23 and 4 and variance 3 and I, respectively,
what distribution does 4X1 - X2 have? Hint. Use Team
Project 14(g) in Sec. 24.8.
23. A machine fills boxes weighing Y lb with X lb of salt,
where X and Yare normal with mean 100lb and 51b
and standard deviation 1 lb and 0.5 lb, respectively.
What percent of filled boxes weighing between 104 Ib
and 1061b are to be expected?
24. If the weight X of bags of cement is normally
distributed with a mean of 40 kg and a standard
deviation of 2 kg, how many bags can a delivery truck
carry so that the probability of the total load exceeding
2000 kg will be 5%?
25. CAS EXPERIMENT. Confidence lntervals. Obtain
100 samples of size 10 of the standardized normal
distribution. Calculate from them and graph the
corresponding 95o/{' confidence intervals for the mean
and count how many of them do not contain O. Does
the result suppon the theory? Repeat the whole
experiment. compare and comment.
Testing of Hypotheses.
Decisions
The ideas of confidence intervals and of tests 2 are the two most important ideas in modern
statistics. In a statistical test we make inference from sample to population through testing
a hypothesis, resulting from experience or observations, from a theory or a quality
requirement, and so on. In many cases the result of a test is used as a basis for a decision,
for instance, to buy (or not to buy) a certain model of car, depending on a test of the fuel
efficiency (miles/gal) (and other tests, of course), to apply some medication, depending
on a test of its effect; to proceed with a marketing strategy, depending on a test of consumer
reactions, etc.
Let us explain such a test in terms of a typical example and introduce the corresponding
standard notions of statistical testing.
E X AMP L E 1
Test of a Hypothesis. Alternative. Significance Level a
We want to buy 100 coils of a certain kind of wire, provided we can verify the manufacturer's claim that the
wire has a breaking limit /-t = /-to = 200 Ih (or more). This is a test ofthe hypothesis [also called /lull hypothesis}
/-t = /-to = 200. We shall not buy the wire if the (statistical) test shows that actually /-t = /-t1 < /-to, the wire is
weaker, the claim does not hold. /-tl is called the alternative (or alternative iz)lJOtizesis) of the test. We shall
accept the hypothesis if the test suggests that it is true, except for a small error probability a, called the
significance level of the test. Otherwise we reject the hypothesis. Hence a is the probability of rejecting a
hypothesis although it is hue. The choice of a is up to us. SCk and I % are popular values.
For the test we need a sample. We randomly select 25 coils of the wire, cut a piece from each coil, and
determine the breaking limit experimentally. Suppose that this sample of n ~ 25 values of the breaking limit
has the mean x = 1971b (somewhat less than the claim!) and the standard dcviation s = 6 lb.
2Beginning around 1930, a systematic theory of tests was developed by NEYMAN (see Sec. 25.3) and EGON
SHARPE PEARSON (1895-1980), English statistician, the son of Karl Pearson (see the footnote on p. 1066).
SEC. 25.4
Testing of Hypotheses.
1059
Decisions
At this point we could only speculate whether this difference 197 - 200 = - 3 is due to randomness, is a
chance effect. or whether it is significant, due to the actually inferior quality of the wire To continue beyond
,peculation requires probability theory. as follows.
We assume thaI the breaking limit is normall} distributed. (This as,umption could be tested by the method
in Sec. 25.7. Or we could remember the central limit theorem (Sec. 25.3) and take a still larger sample.) Then
T=
x-
/Lo
SI\ ';;
in (II). Sec. 25.3, with JL = /Lo has a t-distribution with 1/ - 1 degrees of freedom (1/ - I = 24 for our sample).
Also X = 197 and s = 6 are observed values of X and S to be used later. We can now choose a significance
level, say, a = 5%. From Table A9 in App. 5 or from a CAS we then obtain a critical value c such that
peT ~ c) = a = 5%. For PIT ~ c) = I - a = 95'it the table gives c = 1.71. so that c = -c = -1.71 because
of the symmetry of the distribution (Fig. 531).
We now reason as follows-this is the crucial idea of the test. If the hypothesis is true, we have a chance of
only a (= 5'it) that we observe a value t of T (calculated from a sample) that will fall between -:x; and -1.71.
Hence if we nevertheless do observe such a t, we assert that the hypothesis cannot be true and we reject it. Then
wc accept the alternative. If. however. t ~ c. wc accept the hypothesis.
A simple calculation finally give~ t = (197 - 200)/(6rV25) = -2.5 as an observed value of T. Since
-2.5 < -1.71. we reject the hypothesis (the manufacturer's claim) and accept the alternative /L = /Ll < 200,
•
the wire ~eems to be weaker than claimed.
I
Reject hypotheSiS~;
Do not reject hypothesis
:
95%
a~~~
c =-1.71
0
Fig. 531.
t-distribution in Example 1
This example illustrates the steps of a test:
1. Formulate the hypothesis {}
=
80 to be tested. (80 = /-Lo in the example.)
2. Formulate an alternative 8 = 8}. (81 = /-Ll in the example.)
3. Choose a significance level a (5%,1%,0.1%).
e
4. Use a random variable
= g(Xl , . . . , Kn) whose distribution depends on the
hypothesis and on the alternative, and this distribution is known in both cases. Determine
assuming the hypothesis to be true. (In the
a critical ':.alue c from the distribution of
example. e = T. and c is. obtained from peT ~ c) = a.)
e,
e
of e.
Accept or reject the hypothesis. depending on the size of erelative to c. (t < c in
5. Use a sample Xl • . . . , Xn to determine an observed value
= g(x}, ... , xn)
(t in the example.)
6.
the example, rejection of the hypothesis.)
Two important facts require further discussion and careful attention. The first is the
choice of an alternative. In the example, /-L] < /-Lo, but other applications may require
/-Ll > /-Lo or /-Ll
/-Lo· The second fact has to do with errors. We know that a (the
significance level of the test) is the probability of rejecting a true hypothesis. And we
shall discuss the probability {3 of accepting afalse hypothesis.
'*
1060
CHAP. 25
Mathematical Statistics
One-Sided and Two-Sided Alternatives (Fig. 532)
Let 8 be an unknown parameter in a distribution, and suppose tbat we want to test the
hypothesis 8 = 80 , Then there are three main kinds of alternatives. namely,
(1 )
(2)
(3)
(1) and (2) are one-sided alternatives, and (3) is a two-sided alternative.
We call rejection region (or critical region) the region such that we reject the
hypothesis if the observed value in the test falls in this region. In CD the critical c lies to
the right of 80 because so does the alternative. Hence the rejection region extends to the
right. This is called a right-sided test. In @ the critical c lies to the left of 80 (as in
Example 1), the rejection region extends to the left, and we have a left-sided test
(Fig. 532, middle part). These are one-sided tests. In @ we have two rejection regions.
This is called a two-sided test (Fig. 532, lower part).
All three kinds of alternatives occur in practical problems. For example, (1) may arise
if 80 is the maximum tolerable inaccuracy of a voltmeter or some other instrument.
Alternative (2) may occur in testing strength of material, as in Example 1. Finally, 80 in
(3) may be the diameter of axle-shafts, and shafts that are too thin or too thick are equally
undesirable, so that we have to watch for deviations in both directions.
Acceptance Region
Do not reject hypothesis
(Accept hypothesis)
Rejection Region
(Critical Region)
Reject hypothesis
c
Acceptance Region
Do not reject hypothesis
(Accept hypothesis)
Rejection Region
(Critical Region)
Reject hypothesis
~ __________________ ~----~I-----------------------80
c
Rejection Region
(Critical Region)
Reject hypothesis
Acceptance Region
Do not reject
hypothesis
(Accept hypothesis)
0------ -----
Re jectlon Region
(Critical Region)
Reject hypothesIs
80
Fig. 532. Test in the case of alternative (1) (upper part of the figure), alternative
(2) (middle part), and alternative (3)
Errors in Tests
Tests always involve risks of making false decisions:
(I) Rejecting a true hypothesis (Type 1 error).
ll' = Probability of making a Type I error.
(II) Accepting a false hypothesis (Type II error).
f3 = Probability of making a Type II error.
SEC 25.4
Testing of Hypotheses.
1061
Decisions
Clearly, we Cannot avoid these errors because no absolutely certain conclusions about
populations can be drawn from samples. But we show that there are ways and means of
choosing suitable levels of risks, that is, of values a and {3. The choice of a depends on
the nature of the problem (e.g., a small risk a = 1% is used if it is a matter of life or
death).
Let us discuss this systematically for a test of a hypothesis ti = tio against an alternative
that is a single number el , for simplicity. We let el > eo, so that we have a right-sided
test. For a left-sided or a two-sided test the discussion is quite similar.
We choose a critical c > eo (as in the upper part of Fig. 532, by methods discussed
below). From a given sample Xl> ••• , Xn we then compute a value
with a suitable g (whose choice will be a main point of our further disc~ssion; for instance,
take g = (Xl + ... + Xn)l11 in the case in which e is the mean). If e > c, we reject the
hypothesis. If ~ c, we accept it. Here, the value can be regarded as an observed value
of the random variable
e
e
(4)
becam,e Xj may be regarded as an observed value of Xj' j = L ... , 11. In this test there
are two possibilities of making an error, as follows.
Type I Error (see Table 25.4). The hypothesis is true but is rejected (hence the
alternative is accepted) because e assumes a value > c. Obviously, the probability of
making such an error equals
e
(5)
a is called the significance level of the test, as mentioned before.
Type II Error (see Table 25.4). The hypothesis is false but is accepted because
assumes a value ~ c. The probability of making such an error is denoted by {3; thus
e
e
(6)
7] = I - {3 is called the power of the test. Obviously, the power TJ is the probability of
avoiding a Type I1 error.
Table 25.4 Type I and Type II Errors in Testing a Hypothesis
()o Against an Alternative ()
()J
() =
=
Unknown Truth
'0
Cl)
E.
'-'
'-'
e = eo
'-'
<C
e = el
e = eo
e = el
True decision
Type II error
P=l-a
P=f3
Type 1 error
True decision
P=I-{3
P=a
1062
CHAP. 25
Mathematical Statistics
Formulas (5) and (6) show that both (]' and f3 depend on c, and we would like to choose
c so that these probabilities of making errors are as small as possible. But the important
Figure 533 shows that these are conflicting requirements because to let (]' decrease we
must shift c to the right, but then f3 increases. In practice we first choose (]' (5%, sometimes
1%), then determine c, and finally compute f3. If f3 is large so that the power 1] = 1 - f3
is small, we should repeat the test, choosing a larger sample, for reasons that will appear
shortly.
"
Density of e if
the hypothesis
is true
:
&
Density of if
the alternative
/
l;f
,,. . . . -T-.. . . . . . . is true
//
"
//
I
I
"
...
I
I
I
I
"
...
"
a-::::----...!
eo
........ _
"
el
c
Acceptance region ~ Rejection region (Critical regIOn)
Fig. 533.
Illustration of Type I and II errors in testing a hypothesis
e = eo against an alternative e = e, (> eo, right-sided test)
If the alternative is not a single number but is of the form (1}-(3), then f3 becomes a
function of e. This function f3( e) is called the operating characteristic (OC) of the test
and its curve the OC curve. Clearly, in this case 1] = 1 - f3 also depends on e. This
function 1](e) is called the power function of the test. (Examples will follow.)
Of course, from a test that leads to the acceptance of a certain hypothesis Bo, it does
not follow that this is the only possible hypothesis or the best possible hypothesis. Hence
the terms "not reject" or "fail to reject" are perhaps better than the term "accept."
Test for IL of the Normal Distribution with Known u 2
The following example explains the three kinds of hypotheses.
E X AMP L E 2
Test for the Mean of the Normal Distribution with Known Variance
Let X be a normal random variable with variance a 2 = 9. Using a sample of size
hypothesis /L = /Lo = 24 against the three kinds of alternatives. namely.
(a)
Solution.
/L> /Lo
(b)
/L < /Lo
/L
(c)
'*
11
= 10 with mcan.Y, test the
/Lo'
We choose the significance level a = 0.05. An estimate of the mean will be obtained from
I
X = - (Xl
+ ... +
X17)'
11
If the hypothesis is true. X is normal with mean /L = 24 and variance
Hence we may obtain the critical value (' from Table A8 in App. 5.
Case (a).
(T
2
Right-Sided Test. We determine e from PIX > e)fL~24 = a
-
P(X ;" C)fL~24 = <P
(e-24)
vo:9
/n = 0.9. see Theorem I, Sec. 25.3.
=
0.05, that is.
= I - a = 0.95.
Table A~ in App. 5 _gives (e - 24)/VO.9 = 1.645, and e = 25.56. which is grcater than /Lo, as in the upper
part of FIg. 532. If x ;" 25.56, the hypothesis is accepted. If.ct' > 25.56, it is rejected. The power function of
the test is (Fig. 534)
SEC. 25.4
Testing of Hypotheses.
Decisions
1063
0.8
0.6
O.Ll
0.2
20
Fig. 534.
> 25.56)" = I -
(7)
= I - <P (
P(X ~ 25.56)"
25.56 - fL)
vo.9
= I - <P(26.94 - 1.05fL}
0.9
Left-Sided Test. The critical value c is obtained from the equation
-
PiX ~
C),,-24
= <P
(cYo.9
-
24)
Table A8 in App. 5 yields c = 24 - 1.56 = 22.44. If x
reject it. Ihe power function of the test is
(8)
Case (c).
fL
Power function 1)(fL) in Example 2, case (a) (dashed) and case (c)
7}(fL) = P(X
Case (b).
28
fLO
7}(fL)
-
= P(X
~
22.44)" = <1>
~
~
= a = 0.05.
22.44. we accept the hypothesis. If x < 22.44. we
( 22.44 - fL )
~
= <1>(23.65
VO.9
Two-Sided Test. Since the normal distribution is symmetric. we choose
+ k. and detennine k from
cl
and
c2
equidistant from
fL = 24. say. cl = 24 - k and c2 = 24
P(24 - k
~ X ~ 24 + k), ,~24 =
<1>(
k
VO.9
) -
<p(-
k
VO.9
)
= I - a = 0.95.
Table A8 in App. 5 gives k/Y0.9 = 1.960. hence k = 1.86. This gives the values cl = 24 - 1.86 = 22.14
and c2 = 24 + 1.86 = 25.86. If x is not smaller than cl and not greater than c2. we accept the hypothesis.
Otherwise we reject it. The power function of the test is (Fig. 534)
7}(fL}
(9)
=
P(X
< 22.14)" +
P(X
=1+<1> (
= I
+
> 25.86)" =
22.14 - fL)
Yo.9
P(X
-<P
< 22.14)" + I -
P(X ~ 25.86)"
(25.86 - fL)
VO.9
<1>(23.34 - 1.05fL) - <P(27.26 - 1.05fL).
Consequently. the operating characteristic f3(.fL) = I - 7}(fL) (see before i is (Fig. 535)
If we take a larger sample. ~ay. of size 11 = 100 (instead of 10). then a 2 /1l = 0.09 (instead of 0.9) and the
critical values are Cl = 23.41 and c2 = 24.59, as can be readily verified. Ihen the operating characteristic of
the test is
f3(fL) =
<p( 24.59 VO.09
=
fL) _ <1>( 23.41
. fL)
V'0.09
<P(81.97 - 3.33fL) - <P(78.03 - 3.33fL}.
1064
CHAP. 25
Mathematical Statistics
Figure 535 shows that the corresponding OC curve is steeper than that for 11 = 10. l11is means that the increase
of 11 has led to an improvement of the test. In any practical case, 11 is chosen as small as possible but SO large that
the test brings out deviations bet",een J-L and J.Lo that are of practical interest. For instance. if deviations of ±2 units
are of interest. we see from Fig. 535 that 11 = 10 is much too small because when J-L = 24 - 2 = 22 or J-L = 24
+ 2 = 26 f3 is almost 50%. On the other hand, we see that" = I 00 is sufticient for that purpose.
•
(3(p.)
1.0
\
\
0.8
0.6
0.4
n = 10
\
0.2
n= 100
,
....... 6-
28 p.
26
P.o
Curves of the operating characteristic (OC curves) in
Example 2, case (e), for two different sample sizes n
Fig. 535.
Test for J.L When u 2 is Unknown, and for u 2
E X AMP L E 3
Test for the Mean of the Normal Distribution with Unknown Variance
The tensile strength of a sample of /I = 16 manila ropes (diameter 3 in.) was measured. The sample mean was
4482 kg, and the sample standard deviation was s = 115 kg (N. C. Wiley, 41st Annual Meeting of the
American Society for Testing Materials). Assuming that the tensile strength is a normal random variable, test
the hypothesis J.Lo = 4500)..g against the alternative J-LI = 4400 kg. Here J.Lo may be a value given by the
manufacturer. while J-Ll may result from previous experience.
x=
We choose the significance level a = S%. If the hypothesis i~ true, it follows from Theorem 2 in
Sec. 25.3, that the random variable
Solution.
T=
x-
x-
J-Lo
Sry';;
4500
S/4
has at-distribution ",ith 11 - I = IS d.f. The test is left-sided. TIle critical value c is obtained from
peT < c)I'O = a = 0.05. Table A9 in App. 5 gives c = -1.7S. As an observed value of T we obtain from the
sample t = (4482 - 4S00)/( 11S/4) = -0.626. We see that t > c and accept the hypothesis. For obtaining
numeric values of the power of the test. we would need tables called noncentral Student t-tables: we shall not
discuss this question here.
•
E X AMP L E 4
Test for the Variance of the Normal Distribution
Using a sanlple of size n = 15 and '<lmple varilmce s2 = 13 from a normal population, test the hypothe,is
2
2
2
2
u = Uo = 10 against the alternative u = Ul = 20.
Solution.
We choose the significance level a = 5%. If the hypothesis is true, then
S2
y=
(/I -
S2
1)(J',2
1410 = lAS
2
o
has a chi-square distribution with n - I = 14 d.f. by Theorem 3, Sec. 2S.3. From
P(Y> c)
= a = O.OS.
that is.
PlY ~ c) = 0.9S,
and Table AIO in App. 5 with 14 degrees of freedom we obtain c = 23.68. This is the critical value of Y. Hence
SEC. 25.4
Testing of Hypotheses.
Decisions
1065
to S2 = Uo 2Y1(11 - 1) = 0.7I4Y there corresponds the critical value c* = 0.714' 23.68 = 16.91. Since
< c*. we accept the hypothesis.
If the alternative is true, the random variable Y1 = l4S2/U12 = 0.7S2 has a chi-square distribution with 14
dJ. Hence our test has the power
.1'2
From a more extensive table of the chi-square distribution (e.g. in Ref. [G3] or [G8]) or from your CAS, you
see that 7J = 62%. Hence the Type II risk is very large. namely. 38%. To make this risk smaller. we would
have to increase the sample size.
•
Comparison of Means and Variances
EXAMPLE 5
Comparison of the Means of Two Normal Distributions
Using a sample Xl> ••. , x n , from a normal distribution with unknown mean /L,- and a sample Yl, ... , Yn from
2
another normal distribution with unknown mean /Ly, we want to test the hypothesis that the means are equal,
3
/Lx = /Ly, against an altemative, say, /Lx > /Ly. The variances need not be known but are assumed to be equal.
Two cases of comparing means are of practical importance:
The samples have the same size. Furthermore, each value of the first sample corre~ponds to precisely
one value of the otlzer. because conesponding values result from the same person or thing (paired comparison)for example, two measurements of the same thing by two different methods or two measurements from the two
eyes of the same person. More generally, they may result from pairs of similar individuals or things, for example,
identical twins, pairs of used front tires from the same car, etc. Then we should form the differences of
conesponding values and test the hypothesis that the population conesponding to the differences has mean 0,
using the method in Example 3. If we have a choice. this method is better than the following.
Case A.
Case B.
The tl\'O samples are indepel1dent and not necessarily of the same size. Then we may proceed as
follows. Suppose that tbe altemative is /Lx > /Ly. We choose a significance level a. Then we compute the sample
means X and y as well as (nl - I)s,< 2 and (n2 - l)sy 2, where '\·x2 and Sy 2 are the sample variances. Using Table
A9 in App. 5 with 111 + n2 - 2 degrees of freedom. we now determine (" from
peT ~ c) = I-a.
(10)
We finally compute
(11)
to =
nln2(111
+ n2
"1
+ 112
x-y
- 2)
V(nl
It can be shown that this is an observed value of a random variable that has a t-distribution with nl + 112 - 2
degrees of freedom, provided the hypothesis is true. If to ~ c, the hypothesis is accepted. If to > G, it is rejected.
If the alternative is /Lx
/Ly' then (] 0) must be replaced by
*
(10*)
peT ~
Gl)
=
0.5a,
peT ~ c2)
= I - O.Sa.
Note that for sanlples of equal size "1 = n2 = n, formula (11) reduces to
(12)
3 This assumption of equality of variances can be tested, as shown in the next example. If the test shows that
they differ significantly, choose two samples of tbe same size nl = n2 = n (not too small, > 30, say), use tbe
test in Example 2 together with the fact that (12) is an observed value of an approximately stillldardized normal
random variable.
1066
CHAP. 25
Mathematical Statistics
To illustrate the computations. let us consider the two samples
105
lOR
89
92
(xl' . . . • .lnl )
and (.'"1 ....• -""2) given by
103
103
107
124
105
97
103
107
III
97
and
84
showing the relative output of tin plate worker~ under two different working conditions p. J. B. Worth. JUllrnol
ofJndll.<triol Engineering 9. 2--19-253). Assuming that the conesponding populations are normal and have the
smne variance. let us test the hypothesis iJ-:c = iJ-y against the alternative iJ-x
iJ-y' (Equality of varimlCes will
be tested in the next example.)
*'
Solution.
We find
y
.1' = 105.125.
~
sx2 = 106.125.
97.500.
Sy
2 = 84.000.
We choose the significance level a = 5'k. From (10") with 0.5a = 2.5'if.. 1 - O.5a = 97.5'k and Table A9 in
App.5 with 14 degrees of freedom we obtain ("1 = -2.1--1 and ("2 = 2.14. Fonnula (12) with n = 8 giYes the
value
to =
v'8. 7.625/\ '"i9o.i25 =
1.56.
Since ("1 ~ to ~ ("2' we accept the hypothesis iJ-x = iJ-y that under both conditions the mean output is the same.
Case A applies to the example becau~e the two first sample values conespond to a certain type of work. the
next two were obtained in another kind of work. etc. So we may use the differences
16
16
6
o
o
13
8
of corresponding sample values and the method in Example 3 to test the hypothesis,.,. = O. where,.,. is the memt
of the population cOITcsponding to the differences. As a logical alternativc we take,.,. O. The sample mean is
d = 7.625. and the sample variance is -,2 = --15.696. Hence
*'
t
= v'8 (7.625 - O)/V --15.696 = 3.19.
From P( T ~ ("1) = 2.5'k. P( T ~ ("2) = 97.5'7c and Table A9 in App. 5 with 11 - I = 7 degrees of freedom we
obtain ("1 = - 2.36. ("2 = 2.36 and reject the hypothesis because t = 3.19 does not lie between ("1 and ("2' Hence
our pre~ent test. in which we used more information (but the ,ame samples). shows that the difference in output
is ~ignificant.
•
E X AMP L E 6
Comparison of the Variance of Two Normal Distributions
Using the fWO ~amples in the last example. test the hypothesis u.t .2 = U y 2: a"ume thm the corresponding
populations are normal and the nature of the experiment suggests the altemative u x 2 > u y 2 .
Sollttioll.
We find
-'x2
=
106.125,
8--1.000. We choose the signiticance level a = 5'k. Using
I. n2 - 1) = (7. 7) degrees of freedom. we
= s:r2/Sy2 = 1.26. Since Vo ~ (". we accept the hypothesis. If
Sy2 =
p( V ~ (") = I - a = 95'if. and Table All in App. 5. with (n1 -
determine (" = 3.79. We finally compute Vo
> (". we would reject it.
This test is justified by the fact that Vo i~ an ob,erved value of a random variable that ha, a ,o-called
F-distribution with (n1 - 1.112 - I) degrees of freedom. provided the hypothesis is true. (Proof in Ref. IG3]
li,ted in App. I.J The F-distribution with (111. n) degree, of freedom was introduced by R. A. Fisher4 and has
the distribution function F(:::) = 0 if : < 0 and
Vo
(13)
F(:::) = Kmn
f
z
,<'n-2)/2(l1It
+ n)-<11t+n)/2
dt
(:::
~
0).
o
•
4 After the pioncering work of the English statistician and biOlogist. KARL PEARSON (1857-1936). the
founder of the English school of statistics. and WILLIAM SEALY GOSSET (]876-1937). who discovered the
t-di,tribution (mtd published under the name ·Student"). the English statistician Sir RONALD AYLMER FISHER
(1890-1962). professor of eugenic~ in London (1933-1943) and professor of genetics in Cambrid"e Enuland
(19~3:-1957) and Adelaide, Australia (1957-1962), had great influence on the ~fllfther developmen~;f m~dern
statIstics.
SEC 25.4
Testing of Hypotheses.
1067
Decisions
This long section contained the basic ideas and concepts of testing, along with typical
applications and you may perhaps want to review it quickly before going on, because the
next sections concern an adaption of these ideas to tasks of great practical importance and
resulting tests in connection with quality control, acceptance (or rejection) of goods
produced. and so on.
...
-..
_..
1. Test JL = a against JL > 0, assuming normality and
using the sample 1. -\. I. 3. -8. 6. a (deviations of
the azimuth [multiples of 0.01 radian] in some
revolution of a satellite). Choose a = 5'k.
2. In one of his classical experiments Buffon obtained
2048 heads in tossing a coin 4040 times. Was the coin
fair'?
3. Do the same test as in Prob. 2. using a result by
K. Pearson. who obtained 6 019 heads in 12000 trials.
4. Assuming normality and known variance u 2 = 4. test
the hypothesis fL = 30.0 again~t the alternative (a)
JL = 28.5. (b) JL = 30.7. using a sample of size 10 with
mean x = 28.5 and choosing a = 5%.
5. How does the result in Prob. 4(a) change if we use a
smaller sample. say. of size 4. the other data (.ct = 28.5,
a = 5%. etc.) remaining as before?
6. Detemine the power of the test in Prob. 4(a).
7. What is the rejection region in Prob. 4 in the case of a
two-sided test with a = 5'k?
8. Using the sample 0.80. 0.81, 0.81. 0.82, 0.81. 0.82,
0.80,0.82. 0.81. 0.81 (length of nails in inches), test
the hypothesis JL = 0.80 in. (the length indicated on
the box) against the alternative JL
0.80 in. (A"sume
normality. choose a = 5%.)
*'
9. A firm sells oil in cans containing 1000 g oil per can
and is interested to know whether the mean weight
differs significantly from 1000 g at the 5% level. in
which case the filling machine has to be adjusted. Set
up a hypothesis and an alternative and perform the test.
assuming normality and using a sample of 20 fillings
with mean 996 g and standard deviation 5 g.
10. If a sample of 50 tires of a certain kind has a mean life
of 32 000 mi and a standard deviation of 4000 mi. can
the manufacturer claim that the true mean life of such
tires is greater than 30000 mi? Set up and test a
corresponding hypothesis at a 5% level, assuming
normality.
11. If simultaneous measurements of electric voltage by
two different types of voltmeter yield the differences
(in volts) 0.8. 0.2, -0.3.0.1. 0.0. 0.5, 0.7. 0.2, can we
assert at the 5% level that there is no Significant
difference in the calibration of the two types of
instruments'! (Assume normality.)
12. If a standard medication cures about 70% of patients
with a certain disease and a new medication cured 148
of the first 200 patients on whom it was hied, can we
conclude that the new medication is better? (Choose
a = 5%.)
13. Suppose that in the past the standard deviation of
weights of certain 25.0-oz packages tilled by a machine
was 0.4 oz. Test the hypothesiS Ho: u = 0.4 against
the alternative HI: U> 0.4 (an undesirable increase).
using a sample of 10 packages with ~tandard deviation
0.507 and assuming normality. (Choose a = 5%.)
14. Suppose that in operating battery-powered electrical
equipment, it is less expensive to replace all batteries
at fixed intervals than to replace each battery
individually when it breaks down, provided the
standard deviation of the lifetime is less than a certain
limit. say. less than 5 hours. Set up and apply a suitable
test. using a sample of 28 values of lifetime~ with
standard deviation s = 3.5 hours and assuming
normality: choose a = 5%.
15. Brand A gasoline was used in 9 automobiles of the
same model under identical conditions. The
corresponding sample of 9 values (miles per gallon)
had mean 20.2 and standard deviation 0.5. Under the
same conditions. high-power brand B gasoline gave a
sample of 10 values with mean 21.8 and standard
deviation 0.6. Is the mileage of B significantly better
than that of A'? (Test at the 5% level; assume
normality.)
16. The two samples 70. 80. 30. 70. 60. 80 and 140. 120,
130. 120. 120. 130. 120 are values of the differences
of temperatures (OC) of iron at two stages of casting.
taken from two different crucibles. Is the variance of
the first population larger than that of the second?
(Assume normality. Choose a = 5%.)
17. Using samples of sizes 10 and 16 with variances
2
2
Sx = 50 and Sy = 30 and assuming normality of the
corresponding popUlations, test the hypothesis
Ho: a./ = u y2 against the alternative u/ > u y2 .
Choose a = 5%.
18. Assuming normality and equal varIance and usmg
independent samples with 1/1 = 9, .r = 12 . .\'x = 2,
1/2 = 9, Y = 15, Sy = 2. test Ho: JLx = JL y against
JLx
fLy; choose a = 5%.
*'
1068
CHAP. 25
Mathematical Statistics
19. Show that for a nonnal distribution the two types of
en'ors in a test of a hypothesis Ho: IL = JLo against an
alternative HI: IL = ILl can be made as small as one
pleases (not zero) by taking the sample sufficiently large.
20. CAS EXPERIMENT. Tests of Means and
Variances. (a) Obtain 100 samples of size 10 each from
25.5
the normal distribution with mean 100 and variance 25.
For each sample test the hypothesis ILo = 100 against
the alternative ILl > 100 at the level of a = 10% Record
the number of rejections of the hypothesis. Do the whole
experiment once more and compare.
(b) Set up a similar experiment for the variance of a
normal distribution and perform it 100 times.
Quality Control
The ideas on testing can be adapted and extended in various ways to serve basic practical
needs in engineering and other fields. We show this in the remaining sections for some
of the most important tash; solvahle by statistical methods. As a first such area of
problems, we discuss industrial quality control, a highly successful method used in
various industries.
No production process is so perfect that all the products are completely alike. There
is always a small variation that is caused by a great number of small. uncontrollable
factors and must therefore be regarded as a chance variation. It is important to make
sure that the products have required values (for example, length. strength, or whatever
property may be essential in a particular case). For this purpose one makes a test of the
hypothesis that the products have the required property. say. fL = fLo, where fLo is a
required value. If this is done after an entire lot has been produced (for example, a lot
of 100 000 screws), the test will tell us how good or how bad the products are, but it
it obviously too late to alter undesirable results. It is much better to test during the
production run. This is done at regular intervals of time (for example, every hour or
half-hour) and is called quality control. Each time a sample of the same size is taken,
in practice 3 to 10 times. If the hypothesis is rejected. we stop the production and look
for the cause of the trouble.
If we stop the production process even though it is progressing properly, we make a
Type I error. If we do not stop the process even though something is not in order. we
make a Type II error (see Sec. 25.4). The result of each test is marked in graphical form
on what is called a control chart. This was proposed by W. A. Shew hart in 1924 and
makes quality cuntrol particularly effective.
Control Chart for the Mean
An illustration and example of a control chart is given in the upper part of Fig. 536. This
control chart for the mean shows the lower control limit LCL, the center control line
CL, and the upper control limit UCL. The two control limits correspond to the critical
values Cl and C 2 in case (c) of Example 2 in Sec. 25.4. As soon as a silmple mean falls
outside the range between the control limits, we reject the hypothesis and assert that the
production process is "out of control"; that is, we assert that there has been a shift in
process level. Action is called for whenever a point exceeds the limits.
If we choose control limits that are too loose, we shall not detect process shifts. On the
other hand, if we choose control limits that are too tight, we shall be unable to run the
process because of frequent searches for nonexistent trouble. The usual significance level
SEC 25.5
Quality Control
1069
is Q' = 1%. From Theorem I in Sec. 25.3 and Table A8 in App. 5 we see that in the case
of the normal distribution the corresponding control limits for the mean are
(1)
0"
LCL = lLo - 2.58 ~!
UCL
'
=
lLo
Vll
+
0"
2.58 ---;= .
\'11
Here 0" is assumed to be known. If 0" is unknown, we may compute the standard deviations
of the first 20 or 30 samples and take their arithmetic mean as an approximation of 0".
The broken line connecting the means in Fig. 536 is merely to display the results.
Additional, more subtle controls are often used in industry. For instance, one observes
the motions of the sample means above and below the centerline, which should happen
frequently. Accordingly, long runs (conventionally of length 7 or more) of means all above
(or all below) the centerline could indicate trouble.
4.20
~
I
I
I
0.5%
ue'
4.15
I
c
~
:::;;:
4.10
/
~
,Y
II
1\
CL
f"
'f
I
I
l
99%
lel_{
4.05
\
0.5%
/
[
)
4.00
Sample no.
10
5
0.04
0.0365
I
I I .!
I
1%
II
UCL-
/
0.03
I
c
a
:;::;
'"
.;;
1
W
"0
"Cl
ro
"Cl
0.02
0.01
Sample no.
't.\
1\
I 1\II
c
'"
ill
\
I
\
\
99 %
\
1/
o
5
10
Control charts for the mean (upper part of figure) and
the standard deviation in the case of the samples on p. 1070
Fig. 536.
1070
CHAP. 25
Mathematical Statistics
Table 25.5 Twelve Samples of Five Values Each
(Diameter of Small Cylinders, Measured in Millimeters)
Sample
Number
-
Sample Values
I
2
3
4
5
6
7
8
9
10
II
12
x
s
R
4.06
4.10
4.06
4.06
4.08
4.08
4.10
4.06
4.08
4.10
4.08
4.12
4.08
4.08
4.12
4.08
4.12
4.10
4.10
4.12
4.10
4.12
4.12
4.12
4.12
4.080
4.112
4.084
4.088
4.108
0.014
0.011
0.026
0.023
0.018
0.04
0.02
0.06
0.06
0.04
-1-.08
4.06
4.08
4.06
4.06
4.10
4.08
4.08
4.08
4.08
4.10
4.08
4.10
4.10
4.10
4.10
4.10
4.10
4.12
4.12
4.12
4.12
4.12
4.14
4.16
4.100
4.088
4.096
4.100
4.104
0.014
0.023
0.017
0.032
0.038
0.04
0.06
0.04
0.08
0.10
4.12
4.14
4.14
4.14
4.14
4.16
4.14
4.16
4.16
4.16
4.140
4.152
0.014
0.011
0.04
0.02
Control Chart for the Variance
In addition to the mean, one often controls the variance. the standard deviation, or the
range. To set up a control chart for the variance in the case of a normal distribution, we
may employ the method in Example 4 of Sec. 25.4 for detennining control limits. It is
customary to use only one control limit. namely. an upper control limit. Now from Example
4 of Sec. 25.4 we have S2 = uo2 Y/(n - I), where because of our normality assumption
the random variable Y hao.; a chi-square distribution with 11 - 1 degrees of freedom. Hence
the desired control limit is
(2)
UCL
=
n-I
where (' is obtained from the equation
P(Y> c) = a,
that is,
P(Y
~
c)
=
I - a
and the table of the chi-square distribution (Table AIO in App. 5) with 11 - 1 degrees of
freedom (or from your CAS); here a (51k or I st. say) is the probability that in a properly
running process an observed value S2 of S2 is greater than the upper control limit.
If we wanted a control chart for the variance with both an upper control limit UCL and
a lower control limit LCL. these Iimi[s would be
u 2c1
LCL= _ _
(3)
where
(4)
1/-1
Cl
and
C2
and
UCL=
are obtained from Table A I 0 with
and
11 -
P(Y
I d.f. and the equations
~ C2)
=
a
I - -.
2
SEC. 25.5
Quality Control
1071
Control Chart for the Standard Deviation
To set up a control chart for the standard deviation, we need an upper control limit
uVc
UCL=
(5)
Vn"=l
obtained from (2). For example, in Table 25.5 we have 11 = 5. Assuming that the
corresponding population is normal with standard deviation u
0.02 and choosing
a = I %, we obtain from the equation
P(Y
~
c)
= I -
a
= 99%
and Table A lOin App. 5 with 4 degrees of freedom the critical value c
(5) the corresponding value
UCL=
0.02vrns
V4
= 13.28 and from
= 0.0365,
which is shown in the lower part of Fig. 536.
A control chart for the standard deviation with both an upper and a lower control limit
is obtained from (3).
Control Chart for the Range
Instead of the variance or standard deviation, one often controls the range R (= largest
sample value minus smallest sample value). It can be shown that in the case of the normal
distribution, the standard deviation u is proportional to the expectation of the random
variable R* for which R is an observed value, say, u = AnE(R*), where the factor of
proportionality An depends on the sample size 11 and has the values
II
An
=
uIE(R*)
11
2
3
4
5
6
7
8
9
10
0.89
0.59
0.49
0.43
0.40
0.37
0.35
0.34
0.32
12
14
16
18
20
30
40
50
0.31
0.29
0.28
0.28
0.27
0.25
0.23
0.22
Since R depends on two sample values only, it gives less information about a sample
than s does. Clearly, the larger the sample size 11 is, the more information we lose in using
R instead of s. A practical rule is to use s when 11 is larger than 10.
I. Suppose a machine for filling cans with lubricating oil
is set so that it will generate fillings which form a
normal population with mean I gal and standard
deviation 0.03 gal. Set up a conrrol chart of the type
shown in Fig. 536 for controlling the mean (that is. find
LCL and VCL). a~suming that the sample size is 6.
2. (Three-sigma control chart) Show that in Prob. I, the
requirement of the significance level a = 0.3% leads
to LCL = J.L - 3ulY;; and VCL = J.L + 3ulY;;, and
find the corresponding numeric values.
3. What sample size should we choose in Prob. 1 if we
want LCL and VCL somewhat closer together. say.
VCL - LCL = 0.05. without changing the significance
level'!
CHAP. 25
1072
Mathematical Statistics
4. How does the meaning of the control limits ( I ) change
if we apply a control chart with these limits in the case
of a population that is not normal?
5. How should we change the sample size in controlling the
mean of a normal population if we want the difference
UCL - LCL
to decrease to half its original value?
6. What LCL and UCL should we use instead of (1) if
instead of x we use the sum Xl + ... + xn of the
sample values? Detennine these limits in the case of
Fig. 536.
7. Ten samples of size 2 were taken from a production
lot of bolts. The values (length in mm) are as shown.
Assuming that the population is normal with mean 27.5
and variance 0.024 and using (I ), set up a control chart
for the mean and graph the sample means on the chart.
Sample
Length
4
3
2
No.
5
7
6
8
9
10
27.4 27.4 '17.5 27.3 27.9 27.6 27.6 27.8 27.5 27.3
14. How would progressive tool wear in an automatic lathe
operation be indicated by a control chart of the mean?
Answer the same question for a sudden change in the
position of the tool In that operation.
15. (Number of defects per unit) A so-called c-chart or
defects-per-unit chart is used for the control of the
number X of defects per unit (for instance, the number
of defects per 10 meters of paper. the number of
missing rivets in an airplane wing, etc.) (a) Set up
formulas for CL and LCL, UCL corresponding to
J.L ::':: 3u.
assuming that X has a Poisson distribution. (b) Compute
CL, LCL, and UCL in a control process ofthe number
of imperfections in sheet glass; assume that this number
is 2.5 per sheet on the average when the process is
under control.
16. (Attribute control charts). Twenty samples of size
100 were taken from a production of containers. The
numbers of defectives (leaking containers) in those
samples (in the order observed) were
27.6 27.4 '17.7 27.4 27.5 27.5 27.4 27.3 27.4 27.7
8. Graph the means of the following 10 samples
(thickness of washers. coded values) on a control chart
for means, assuming that the population is normal with
mean 5 and standard deviation 1.55.
Time
8:008:309:00 9:30 10:00 10:30 11:00 11:3012:00 12:30
13
Sample 4
Values
18
.4
3
6
6
5
2
5
8
6
7
5
4
4
7
3
6
5
4
4
3
6
5
6
4
6
6
4
6
4
5
5
6
4
5
2
5
3
9. Graph the ranges of the samples in Prob. 8 on a control
chart for ranges.
10. What effect on UCL - LCL does it have if we double
the sample size? Ifwe switch from £l' = 1% to £l' = 5%?
11. Since the presence of a point outside control limits for
the mean indicates trouble ("the process is out of
contro]"), how often would we be making the mistake
oflooking for nonexistent trouble if we used (a) I-sigma
limits. (b) 2-sigma limits? (Assume normality.)
12. Graph An = uIE(R*) as a function of 11. Why is
monotone decreasing function of 11?
An a
13. (Number of defectives) Find formulas for the UCL,
CL, and LCL (corresponding to 3u-limits) in the case
of a control charI for the number of defectives,
assuming that in a state of statistical control the fraction
of defectives is p.
376
45497056134
902
12
8.
From previous experience it was known that the
average fraction defective is p = 5% provided that
the process of production is mnning properly. Using
the binomial distribution. ~et up afmction defectil'e chart
(also called a p-chart). that is. choose the LCL = 0
and determine the UCL for the fraction defective (in
percent) by the use of 3-sigma limits. where u 2 is the
variance of the random variable
X
=
Fractioll defectil'e ill a sample of size 100.
Is the process under control?
17. CAS PROJECT. Control Charts. (a) Obtain 100
samples of 4 values each from the normal distribution
with mean 8.0 and variance 0.16 and their means.
variances, and ranges.
(b) Use these samples for making up a control chart
for the mean.
(c) Use them on a control chart for the standard
deviation.
(d) Make up a control chart for the range.
(e) Describe quantitative properties of the samples
that you can see from those charts (e.g., whether the
corresponding process is under control, whether the
quantities observed vary randomly, etc.).
SEC. 25.6
1073
Acceptance Sampling
25.6
Acceptance Sampling
Acceptance sampling is usually done when products leave the factory (or in some cases
even within the factory). The standard situation in acceptance sampling is that a producer
supplies to a consumer (a buyer or wholesaler) a lot of N items (a carton of screws. for
instance). The decision to accept or reject the lot is made by determining the number x
of defectives (= defective items) in a sample of size 11 from the lot. The lot is accepted
if x ~ c, where c is called the acceptance number, giving the allowable number of
defectives. If x > c, the consumer rejects the lot. Clearly, producer and consumer must
agree on a ce11ain sampling plan giving 11 and c.
From the hypergeometric distribution we see that the event A: "Accept the lot" has
probability (see Sec. 24.7)
c
(1)
P(A) = P(X ~ c) =
2:
x=o
where M is the number of defectives in a lot of N items. In terms of the fraction defective
e = MIN we can write (1) as
(2)
P(A; e) can assume n + 1 values conesponding to e = 0, liN, 21N, ... , NIN; here, n
and c are fixed. A monotone smooth curve through these points is called the operating
characteristic curve (OC curve) of the sampling plan considered.
E X AMP L E 1
Sampling plan
Suppose that certain tool bits are packaged 20 to a box. and the following sampling plan is u~ed. A sample of
two tool bits is drawn, and the corresponding box is accepted if and only if both bits in the sample are good.
In Ihis case, N = 20, 11 = 2, C = O. and (2) t<lies the form la factor 2 drops out)
(20 - 208)(19 - 200)
380
The values of peA. 0) for 8 = O. 1120.2120. ...• 20/20 and the re,ulting OC curve are shown in Fig. 537 on
p. 1074. (Verify!)
•
In most practical cases e will be small (less than 10%). Then if we take small samples
compared to N, we can approximate (2) by the Poisson distribution (Sec. 24.7); thus
(3)
(J.-t
= ne).
1074
CHAP. 25
p(A;e)
Mathematical Statistics
0.5
p(Ae) 0.5
0.2
Ii
Fig. 537.
E X AMP L E 2
Ii
OC curve of the sampling plan with n = 2
and c = 0 for lots of size N = 20
Fig. 538.
OC curve in Example 2
Sampling Plan. Poisson Distribution
Suppose that for large lots the following sampling plan is used. A sample of size II = 20 is taken. If it contains
not more than one defective. the lot is accepted. If the sample contains two or more defectives. the lot is rejected.
In this plan, we obtain from (3)
e- 2o o(1 + 20/J).
peA: /J) -
•
The corresponding OC curve is shown in Fig. 538.
Errors in Acceptance Sampling
We -;how how acceptance sampling fits into general test theory (Sec. 25.4) and what this
means from a practical point of view. The producer wants the probability a of rejecting
an acceptable lot (a lot for which 6 does not exceed a certain number 60 on which the
two pm1ies agree) to be small. 60 is called the acceptable quality level (AQL). Similarly,
P(A:Ii)
95%
S]~
\
Producer's risk
a = 5°
50%
\
15%
:;ollsumer's risk
{3= 15°1-.
-1"--------I
I
o 60
iiI
= 1%
= 5%
Good : Indifference ' Poor
material,
zone
: material
Fig. 539.
OC curve, producer's and consumer's risks
SEC. 25.6
1075
Acceptance Sampling
the consumer (the buyer) wants the probability f3 of accepting an unacceptable lot (a lot
for which e is greater than or equal to some e1 ) to be small. e1 is called the lot tolerance
percent defective (LTPD) or the rejectable quality level (RQL). a is called producer's
risk. It corresponds to a Type [ error in Sec. 25.4. f3 is called consumer's risk and
corresponds to a Type II error. Figure 539 shows an example. We see that the points
(eo, I - a) and (e1 • f3) lie on the OC curve. It can be shown that for large lots we can
choose eo, el (> eo), a, f3 and then determine 11 and c such that the OC curve runs very
close to those prescribed points. Table 25.6 shows the analogy between acceptance
sampling and hypothesis testing in Sec. 25.4.
Table 25.6
Acceptance Sampling and Hypothesis Testing
Acceptance Sampling
Hypothesis Testing
---- . - - - - ----+-------=-------------1
Acceptable quality level (AQL) e = 80
Hypothesis tI = tlo
Lot tolerance percent defectives (LTPD)
Alternative 8 = til
8 = 81
Allowable number of defectives c
Critical value c
Producer·s risk a of rejecting a lot
Probability a of making a Type Terror
(significance level)
with 8 ~ 80
Consumer's risk {3 of accepting a lot
Probability {3 of making a Type II error
with 8 ~ 81
Rectification
Rectification of a rejected lot means that the lot is inspected item by item and all defectives
are removed and replaced by nondefective items. (This may be too expensive if the lot is
cheap; in this case the lot may be sold at a cut-rate price or scrapped.) If a production
turns out 100e% defectives, then in K lots of size N each, KN8 of the KN items are
defectives. Now KP(A; 8) of these lots are accepted. These contain KPNe defectives,
whereas the rejected and rectified lots contain no defectives, because of the rectification.
Hence after the rectification the fraction defective in all K lots equals KPNeIKN. This is
called the average outgoing quality (AOQ); thus
(4)
AOQ(e) = ep(A; e).
\
OC curve
\
0.5
\
AOQL.-~
0/"
o
Fig. 540.
..f
e*
..
0.5
~
e
OC curve and AOQ curve for the sampling plan in Fig. 537
CHAP. 25
1076
Mathematical Statistics
Figure 540 on p. 1075 shows an example. Since AOQ(O) = 0 and peA; 1) = 0, the AOQ
curve has a maximum at some = 8*, giving the average outgoing quality limit (AOQL).
This is the worst average quality that may be expected to be accepted under rectification.
e
--_....---........ -...........
..
-~
......
.-..
1. Lots of knives are inspected by a sampling plan that
uses a sample of size 20 and the acceptance number
c = I. What are probabilitIes of accepting a lot with
1%, 2%, 10% defectives (dull blades)? Use Table A6
in App. 5. Graph the OC curve.
2. What happens in Prob. I if the sample size is increased
to 50? First guess. Then calculate. Graph the OC curve
and compare.
3. How will the probabilities in Prob. I with 11 = 20 change
(up or down) if we decrease c to zero? First guess.
compare with Example I.
11. Samples of 5 screws are drawn from a lot with fraction
defective O. The lot is accepted if the sample contains
(a) no defective screws, (b) at most 1 defective screw.
Using the binomial distribution, find, graph, and
compare the OC curves.
12. Find the risks in the single sampling plan with 11 = 5
and c = 0, assuming that the AQL is 00 = I % and the
RQL is 01 = 15%.
4. What are the producer's and consumer's risks in
Prob. I if the AQL is 1.5% and the RQL is 7.5'70?
13. Why is it impossible for an OC curve to have a vertical
portion separating good from poor quality?
5. Large lots of batterie~ are inspected according to the
following plan. 11 = 30 batteries are randomly drawn
from a lot and tested. If this sample contains at most
c = I defective battery, the lot is accepted. Otherwise
it is rejected. Graph the OC curve of the plan, using
the Poisson approximation.
6. Graph the AOQ curve in Prob. 5. Determine the
AOQL, assuming that rectification is applied.
14. If in a single sampling plan for large lots of spark plugs,
the sample size is 100 and we want the AQL to be 5%
and the producer's risk 2%, what acceptance number
c should we choose? (Use the normal approximation.)
7. Do the work required in Prob. 5 if 11 = 50 and c = O.
8. Find the binomial approximation of the hypergeometric
distribution in Example 1 and compare the approximate
and the accurate values.
15. What is the consumer's risk in Prob. 14 if we want the
RQL to be 12%?
16. Graph and compare sampling plans with c = I and
increasing values of II, say, 11 = 2. 3, 4. (Use the
binomial distribution.)
9. In Example 1, what are the producer's and consumer's
risks if the AQL is 0.1 and the RQL is 0.6?
17. Samples of 3 fuses are drawn from lots and a lot is
accepted if in the corresponding sample we find no
more than I defective fuse. Criticize this sampling plan.
In particular, find the probability of accepting a lot that
is 509r defective. (Use the binomial distribution.)
10. Calculate peA; 0) in Example 1 if the sample size is
increased from 11 = 2 to 11 = 3, the other data remaining
as before. Compute peA; 0.10) and peA; 0.20) and
18. Graph the OC curve and [he AOQ curve for the single
sampling plan for large lots with 11 = 5 and c = 0, and
find the AOQL.
25.7
Goodness of Fit.
To test for goodness of fit means that we wish to test that a certain function F(x) is the
distribution function of a distribution from which we have a sample Xl' . • . ,xn . Then we
test whether the sample distribution function F(x) defined by
F(x)
=
SUIIl
of the relative frequencies of all sample l'lllues xJ not exceedbza
x
b
fits F(x) "sufficiently well." If this is so, we shall accept the hypothesis that F(x) is the
distribution function of the population; if not, we shall reject the hypothesis.
SEC. 25.7
Goodness of Fit.
x 2 -Test
1077
This test is of considerable practical importance, and it differs in character from the
tests for parameters (IL. a 2 • etc.) considered so far.
To test in that fashion, we have to know how much F(x) can differ from F(x) if the
hypothesis is true. Hence we must first introduce a quantity that measures the deviation
of F(x) from F(x), and we must know the probability distribution of this quantity under
the assumption that the hypothesis is true. Then we proceed as follows. We determine a
number c such that if the hypothesis is true, a deviation greater than c has a small
preassigned probability. If, nevertheless, a deviation greater than c occurs, we have reason
to doubt that the hypothesis is true and we reject it. On the other hand. if the deviation
does not exceed c, so that F(x) approximates F(x) sufficiently well. we accept the
hypothesis. Of course. if we accept the hypothesis, this means that we have insufficient
evidence to reject it, and this does not exclude the possibility that there are other functions
that would not be rejected in the test. In this respect the situation is quite similar to that
in Sec. 25.4.
Table 25.7 shows a test of that type, which was introduced by R. A. Fisher. This test
is justified by the fact that if the hypothesis is true, then Xo 2 is an observed value of a
random variable whose distribution function approaches that of the chi-square distribution
with K - I degrees of freedom (or K - r - 1 degrees of freedom if r parameters are
estimated) as 11 approaches infinity. The requirement that at least five sample values lie
in each interval in Table 25.7 results from the fact that for finite 11 that random variable
has only approximately a chi-square distribution. A proof can be found in Ref. [G3] listed
in App. 1. If the sample is so small that the requirement cannot be satisfied. one may
continue with the test, but then use the result with caution.
Table 25.7 Chi-square Test for the Hypothesis That F(x) is the Distribution Fundion
of a Population from Which a Sample XlJ ••• , Xn is Taken
Step 1. Subdivide the x-axis into K intecvab 110 12 ,
. . . , IK such thm each interval contains
at least 5 values of the given smuple Xl, . . . , x n . Determine the number bj of smupJe
values in the interval I j , where j = I .... , K. If a sample value lies at a common
boundary point of two intervals, add 0.5 to each of the two corresponding b;r
Step 2. U:-ing F(x), compute the probability Pj that the random variable X under
consideration assumes any value in the interval I j , where j = 1, ... , K. Compute
ej
= IIPj.
(This is the number of sample values theoretically expected in I j if the hypothesis
is true.)
Step 3. Compute the deviation
(1)
Step 4. Choose a significance level (5%. I %. or the like).
Step 5. Determine the solution c of the equation
P(X 2 ~ c)
=
I -
Q'
from the table of the chi-sqare distribution with K - I degrees of fi'eedom (Table
AIO in App. 5). If rparameters of F(x) are unknown and their maximum likelihood
estimates (Sec. 25.2) are used. then use K - r - I degrees of freedom (instead
2
of K - I). If Xo ~ c, accept the hypothesis. If Xo 2 > c, reject the hypothesis.
1078
CHAP. 25
Mathematical Statistics
Table 25.8 Sample of 100 Values of the Splitting Tensile Strength (Ib/in?)
of Concrete Cylinders
320
350
370
320
400
420
390
360
370
340
380
340
390
350
360
400
330
390
400
360
340
350
390
360
350
350
360
350
360
390
410
360
440
340
390
370
380
370
350
400
340
350
390
350
350
320
330
350
380
410
380
370
330
340
400
330
350
370
380
370
350
370
360
390
340
380
300
370
340
400
360
380
330
350
360
390
360
390
360
360
320
300
400
380
370
400
360
370
330
340
370
420
370
340
420
370
360
340
370
360
D. L. IVEY. Splitting tensile tests on structural lightweight aggregate concrete. Texas Transp0l1ation
Institute. College Station. Texas.
EXAMPLE 1
Test of Normality
Test whether the population from which the sample in Table 25.8
wa~
taken is normal.
SolutiOIl. Table 25.8 show~ the values (column by column) in the order obtained in the experiment. Table
25.9 gives the frequency distribution and Fig. 541 the histogram. It is hard to guess the outcome of the
test-does the histogram resemble a normal density curve sufficiently well or not?
The maximum likelihood estimates for IL and cr2 are jL = X = 364.7 and ;;2 = 712.9. The computation in
Table 25.10 yields Xo2 = 2.942. It is very intereMing that the interval 375 ... 385 contributes over 501ft of
X02. From the histogram we see that the corresponding frequency looks much too small. The second largest
contribution comes from 395 ... 405. and the histogram shows that the frequency seems somewhat too large.
which is perhaps not obvious from inspection.
Table 25.9
Tensile
Strength
x
[lb/in. 2 ]
300
310
320
330
340
Frequency Table of the Sample in Table 25.8
2
Absolute
Frequency
3
4
5
Relative
Frequency
Cumulative
Absolute
Frequency
Cumulative
Relative
Freguency
FlX)
lex)
2
o
4
6
II
350
360
370
380
390
14
16
15
8
400
410
420
430
440
8
2
3
10
o
1
0.02
0.00
0.04
0.06
0.11
2
6
12
23
0.02
0.02
0.06
0.12
0.23
0.14
0.16
0.15
0.08
0.10
37
53
68
76
86
0.37
0.53
0.68
0.76
0.86
0.08
0.02
0.03
0.00
0.01
94
96
99
99
100
0.94
0.96
0.99
0.99
2
LOO
SEC. 25.7
1079
Goodness of Fit. x2-Test
0.20,--------,---,----...,-----,
0.15
((x)
0.10
0.05
o~
250
__~~~~~--~~~~
350
2
[lb.lin. ]
Fig. 541.
Frequency histogram of the sample in Table 25.8
We choose a = 5%. Since K = 10 and we e5timated r = 2 parameters we have to usc Table AlO in App. 5
with K - ,. - I = 7 degrees of freedom. We find c = 14.07 as the solution of P(X 2 ~ c) = 95%. Since
2
Xo < c, we accept the hypothesis that the population is normal.
•
Table 25.10
Computations in Example 1
Xj -
Xj
26.7
ej
hj
Ternl in (1)
0.0000 ... 0.0681
6.81
6
0.0%
0.0681 ... 0.1335
6.54
6
0.045
-1.11 ... -0.74
0.1335 ... 0.2296
0.2296 ... 0.3594
0.3594 ... 0.4960
9.61
12.98
13.66
11
14
16
0.201
0.080
0.401
0.4960 ... 0.6517
0.6517 ... 0.7764
15.57
12.47
15
8
0.021
1.602
0.7764 ... 0.8708
9.44
10
0.033
0.8708 ... 0.9345
0.9345 . . . 1.0000
6.37
8
0.417
6.55
6
0.046
335 ... 345
345 ... 355
355 ... 365
385 ... 395
364.7)
26.7
-1.49 ... - 1.11
-x
395· .. 405
405 ... co
Xj -
... -1.49
-:x;···325
325 ... 335
365 ... 375
375 ... 385
¢(
364.7
-0.74' .. -0.36
-0.36· .
0.01
0.01 ... 0.39
0.39' .
0.76
0.76" . 1.13
1.13 ... 1.51
1.51 ..
co
X02 =
1. If 100 flips of a coin result in 30 heads and 70 tails.
can we assel1 on the 5% level that the coin is fair?
2. If in 10 flips of a coin we get the same ratio as in
Prob. I (3 heads and 7 tails), is the conclusion the same
as in Prob. I? First conjecture. then compute.
3. What would be the smallest number of heads in
Prob. I under which the hypothesis "Fair coin" is still
accepted (with ex = 5%)?
4. If in rolling a die 180 times we get 39. 22. 41. 26. 20,
32. can we claim on the 5% level that the die is fair?
2.942
5. Solve Prob. 4 if the sample is 25, 31. 33, 27, 29. 35.
6. A manufacturer claims that in a process of producing
kitchen knives, only 2.5% of the knives are dull. Test
the claim against the alternative that more than 2.5%
of the knives are dull, using a sample of 400 knives
containing 17 dull ones. (Use ex = 5%.)
7. Between 1 P.M. and 2 P.M. on five consecutive days
(Monday through Friday) a certain service station has
92,60. 66. 62. and 90 customers, respectively. Test the
hypothesis that the expected number of customers during
that hour is the same on lhose days. (Use ex = 591:.)
CHAP. 25
1080
Mathematical Statistics
8. Test for normality at the I % level using a sample of
/I = 79 (rounded) values x (tensile strength [kg/mm2] of
steel sheets of 0.3 nun thickness). a = a(x) = absolute
frequency. (Take the first two values together, also the
last three, to get K = 5.)
58
59
60
61
62
63
10
17
27
8
9
3
64
17.
9. In a sample of 100 patients having a certain disease 45
are men and 55 women. Does this support the claim
that the disease is equally common among men and
women? Choose a = 5%.
10. In Prob. 9 find the smallest number (>50) of women
that leads to the rejection ofthe hypothesis on the levels
5%, 1%, 0.5%.
11. Verify the calculations in Example 1 of the text.
12. Does the random variable X = Number of accide/lTs
per week in 1I certain foundry have a Poisson
distribution if within 50 weeks. 33 were accident-free.
I accident occurred in II of the 50 weeks. 2 in 6 of
the weeks and more than 2 accidents in no week?
(Choose a = 5%.)
13. Using the given sample, test that the corresponding
population has a Poisson distribution. x is the number
of alpha particles per 7.5-sec intervals observed by E.
Rutherford and H. Geiger in one of their classical
experiments in 1910, and a(x) is the absolute frequency
(= number of time periods during which exactly x
particles were observed). (Use a = 5%.)
x
0
a
I 57
t
a
I
16.
2
3
4
5
6
203
383
525
532
408
273
7
8
9
10
II
12
~B
139
45
27
10
4
2
0
18.
19.
load of 5000 lb, can we claim that a new process yields
the same breakage rate if we find that in a sample of
80 rods produced by the new process, 27 rods broke
when subjected to that load? (Use a = 5%.)
Three samples of 200 livets each were taken from a
large production of each of three machines. The
numbers of defective rivets in the samples were 7, 8,
and 12. Is this difference significant? (Use a = 5%.)
In a table of properly rounded function values, even
and odd last decimals should appear about equally
often. Test this for the 90 values of lI(:':) in Table Al
in App. 5.
Are the 5 tellers in a ceI1ain bank equally time-efficient
if during the same time interval on a certain day they
serve 120.95, 110, 108, 102 customers? (Use a = 5%.)
CAS EXPERIMENT. Random Number Generator.
Check your generator expelimentally by imitating
results of 11 trials of rolling a fair die, with a convenient
11 (e.g .. 60 or 300 or the like). Do this many times and
see whether you can notice any "nonrandomness"
features, for example. too few Sixes, too many even
numbers. etc .. or whether your generator ~eems to work
properly. Design and perform other kinds of checks.
20. TEAM PROJECT. Difficulty with Random
Selection. 77 students were asked to choose 3 of the
imegers II, 12, 13, ... ,30 completely arbitrarily. The
amazing result was as follows.
14. Can we assert that the traffic on the three lanes of an
expressway (in one direction) is about the same on each
lane if a count gives 910. 850. 720 cars on the right.
middle. and left lanes, respectively. during a particular
time interval? (Use a = 5%.)
15. If it i5 known that 25% of certain steel rod~ produced
by a standard process will break when subjected to a
'\lumber Il
Frequ.
11
Number 21
rrequ.
12
12
13
14
15
16
17
18
19
20
10 20
8
13
9
21
9
16
8
22 23 24 25 26 27
28
8
15
10
10
9
12
8
29 30
13
9
If the selection were completely random, the following
hypotheses should be true.
(a) The 20 numbers are equally likely.
(b) The 10 even numbers together are as likely as the
10 odd numbers together.
(c) The 6 prime numbers together have probability 0.3
and the 14 other numbers together have probability 0.7.
Te~t these hypotheses. using a = 5%. Design further
experiments that illustrate the difficulties of random
selectiOn.
25.8 Nonparametric Tests
Nonparametric tests, also called distribution-free tests, are valid for any distribution.
Hence they are used in cases when the kind of distribution is unknown, or is known but
such that no tests specifically designed for it are available. In this section we shall explain
the basic idea of these tests, which are based on "order statistics" and are rather simple.
SEC. 25.8
1081
Nonparametric Tests
If there is a choice, then tests designed for a specific distribution generally give better
results than do nonparametric tests. For instance. this applies to the tests in Sec. 25.4 for
the normal distribution.
We shall discuss two tests in terms of typical examples. In deriving the distributions
used in the test, it is essential that the distributions from which we sample are continuous.
(Nonparametric tests can also be derived for discrete distributions, but this is slightly more
complicated. )
E X AMP L E 1
Sign Test for the Median
A median of the population is a solution x ~ Ii of the equation F(x) ~ 0.5. where F is the distribution function
of the population.
Suppose that eight radio operators were tested, first in rooms without air-conditioning and then m
air-conditioned rooms over the same period of time. and the difference of errors (unconditioned minus
conditioned) were
9
o
4
o
4
11.
7
Test the hypothesis Ii ~ 0 (that is, air-conditioning has no effect) against the alternative ji > 0 (that is. inferior
performance in unconditioned rooms).
Solution.
We choose the significance level a ~ 5%. If the hypothesis is true. the probability p of a positive
difference is the same as that of a negative difference. Hence in this case. l' ~ 0.5. and the random variable
x
= NLlmber of positil'e \'a/LIes omollg
11
I'a/ues
has a binomial distribution with p = 0.5. Our sample has eight values. We omit the values O. which do not
contribute to the decision. Then six values are left. all of which are positive. Since
P(X
~ Ii) = (~)
(0.5)6(0.5)0
= 0.0156
=
1.56%
we do have observed an event whose probability is very small if the hypothesis is true: in fact 1.56% < a = 5%.
Hence we assert that the alternative Ii > 0 is true. Thal is. the number of errors made in unconditioned rooms
is significantly higher, so that installation of air conditioning should be considered.
•
E X AMP L E 2
Test for Arbitrary Trend
A certain machine is used for cutting lengths of wire. Five successive piece, had the lengths
29
31
28
30
32.
Using this sample. test the hypothesis that there is no trend, that is. the machine does not have the tendency to
produce longer and longer pieces or shorter and sh0l1er pieces. Assume that the type of machine suggests the
alternative that there is positil'e trend, that is. there is the tendency of successive pieces to get longer.
Solution.
We count the number of transpositions in the sample. that is. the number of times a larger value
precedes a smaller value:
29 precedes 28
(1 transposition),
31 precedes 28 and 30
(2 transpositions).
The remaining three sample values follow in ascendmg order. Hence in the sample there are I
transpositions. We now consider the random valiable
.L
2
3
T = Number of transpositiolls.
If the hypothesis is true (no trend). then each of the 5! = 120 permutations of five elements I 2 3 4 5 has the
same probability (11120). We alTange these permutations according to their number of transpositions:
1082
CHAP. 25
Mathematical Statistics
2
3
T=2
T=I
T=O
4
5
2
2
3
2
3
4
2
3
5 4
3 5
4 5
4 5
2
5
3
5
4
2
2
3
5
3
2
4
2
3
3
4
5
2
1
3
2
2
I
4
3
1
2
3
T=3
4
4
2
3
3
4
4
5
3
4
4
5
5
4
5
5
5
2
2
2
3
2 3
2
3
3
4
3
2
4
5 4 3
5 2
5 2 4
2 5 3
3 2 5
2 3 4
4 5 3
5 3 4 etc.
I 5 4
4
5
I 3 5
2 5 4
4 2 5
4 5
2 3 5
4
From this we obtain
P(T:o:; 3)
= l~O +
lio
+
l~O
+
)1;0
= )2io = 24%.
We accept the hypothesis because we have ob~erved an event that ha, a relatively large probability (certainly
much more than 5'k) if the hypothesis i~ true.
Values of the distribution function of T in the case of no trend are shoINn in Table A12. App. 5. For insrance.
if /I = 3. then FlO) = 0.167. F(I) = 0.500, F(2) = I - 0.167. If /I = 4. then F(O) = 0.042. F(l) = 0.167,
F(2) = 0.375. F(3) = I - 0.375, F(4) = I - 0.167, and SO on.
Our method and those values refer to contilluous distributions. Theoretically. we may then expect that all the
\'alues of a sample arc different. Practically. some sample values may still be equal. because of rounding: If 111
"alues are equal. add m(m - 1)14 (= mean value of thc transpo,itions in the case of the perIl1Ulalion~ of 111
elements). that is.l for each pair of equal values. ~ for each triple. etc.
•
.,
-
1. What would change in Example 1. had we observed
only 5 positive values? Only 4?
2. Does a process of producing plastic pipes of length
f.L = 2 meters need adjustment if in a sample. 4 pipes
have the exact length and 15 are shorter and 3 longer
than 2 meters? (Use the nonnal approximation of the
binomial distribution.)
3. Do the computations in Prob. 2 without the use of the
DeMoivre-Laplace limit theorem (in Sec. 24.8).
4. Test whether a thermostatic switch is properly set to
200 e against the alternative that its setting is too low.
Use a sample of9 values, 8 of which are less than 20 0 e
and I is greater than 20°e.
5. Are air filters of type A better than type B filters if in
IO trials. A gave cleaner air than B in 7 cases, B gave
cleaner air than A in I case. whereas in 2 of the trials
the results for A and B were practically the same?
6. In a clinical experiment. each of 10 patients were given
two different sedatives A and B. The following table
shows the effect (increase of sleeping time. measured
in hours). Using the sign test. find out whether the
difference is significant.
A
B
1.9 0.8
1.I
0.1 -0.1 4.4 5.5 1.6 4.6 3.4
0.7 -1.6 -0.2 -1.2 -0.1 3.4 3.7 0.8 0.0 2.0
Difference 1.2
2.4
1.3
1.3
0.0 1.0 1.8 0.8 4.6 1.4
7. Assuming that the populations corresponding to the
samples in Prob. 6 are nOlmal. apply a suitable test for
the normal distribution.
8. Thirty new employees were grouped into 15 pairs of
similar intelligence and expeIience and were then
instructed in data processing by an old method (A)
applied to one (randomly selected) person of each pair.
and by a new presumably better method (B) applied to
SEC. 25.9
Regression.
1083
Fitting Straight Lines. Correlation
the other person of each pair. Test for equality of
methods against the alternative that (B) is better than
(A), using the following scores obtained after the end
of the training period.
A 60 70 80 85 75 40 70 45 95 80 90 60 80 75 65
I
B 65 85 85 80 95 65 100 60 90 85 100 75 90 60 80
9. Assuming normality. solve Prob. 8 by a suitable test
from Sec. 25.4.
10. Set up a sign test for the lower quartile q25 (defined by
the condition F(q25) = 0.25).
11. How would you proceed in the sign test if the
hypothesis is fi = fio (any number) instead of fi = O?
Temperature T
rCJ
Reading V [volts]
10
30
20
40
50
99.5 101.1 100.4 100.8 101.6
15. In a swine-feeding experiment. the following gains in
weight [kg] of 10 animals (ordered according to
increasing amounts of food given per day) were
recorded:
20
17
19
18
23
16
25
28
24
22.
Test for no trend against positive trend.
16. Apply the test explained in Example 2 to the following
data (x = diastolic blood pressure [mm Hgl. y = weight
of hemt [in grams] of 10 patients who died of cerebral
hemorrhage).
12. Check the table in Example 2 of the text.
A'121 120 95
13. Apply the test in Example 2 to the following data
(x = disulfide content of a certain type of wool,
measured in percent of the content in umeduced fibers;
y = saturation water content of the wool. measured in
percent). Test for no trend against negative trend.
,) 521 465 352 455 490 388 301 395 375 418
y
50
17. Does an increase in temperature cause an increase of
the yield of a chemical reaction from which the
following sample was taken"!
15
30
40
50
55
80
100
Temperature tOe]
10
20
30
40
60
80
46
43
42
36
39
37
33
Yield [kg/min]
O.b
1.1
0.9
1.6
1.2
2.0
14. Test the hypothesis that for a certain type of voltmeter.
readings are independent of temperature T [0C] against
the alternative that they tend to increase with T. Use a
sample of values obtained by applying a constant
voltage:
25.9
123 140 112 92 100 102 91
Regression.
Correlation
18. Does the amount of feltilizer increase the yield of
wheat X [kg/plot]? Use a sample of values ordered
according to increasing amounts of fertilizer:
41.4
43.3
39.6
43.0
44.1
45.6
-l4.5
46.7.
Fitting Straight Lines.
So far we were concerned with random experiments in which we observed a single quantity
(random variable) and got samples whose values were single numbers. In this section we
discuss experiments in which we observe or measure two quantities simultaneously, so
that we get samples of pairs of values (Xl, .\'1), (X2' )'2), ... , (x"' JII). Most applications
involve one of two kinds of experiments, as follows.
1. In regression analysis one of the two variables. call it x. can be regarded as an
ordinary variable because we can measure it without substantial en'or or we can even
give it values we want. x is called the independent variable. or sometimes the
controlled variable because we can control it (set it at values we choose). The other
variable, Y. is a random variable, and we are interested in the dependence of Yon
x. Typical examples are the dependence of the blood pressure Y on the age x of a
person or, as we shall now say, the regression of Yon x. the regression of the gain
of weight Y of certain animals on the daily ration of food x. the regression of the
heat conductivity Y of cork on the specific weight x of the cork. etc.
1084
CHAP. 25
Mathematical Statistics
2. In correlation analysis both quantities are random variables and we are interested
in relations between them. Examples are the relation (one says "correlation") between
wear X and wear Y of the front tires of cars, between grades X and Y of students in
mathematics and in physics, respectively. between the hardness X of steel plates in
the center and the hardness Y near the edges of the plates, etc.
Regression Analysis
In regression analysis the dependence of Y on x is a dependence of the mean /.L of Yon
x, so that /.L = /.L(x) is a function in the ordinary sense. The curve of /.L(x) is called the
regression curve of Y on x.
In this section we discuss the simplest case, namely, that of a straight regression line
(1)
Then we may want to graph the sample values as n points in the xY-plane, fit a straight
line through them, and use it for estimating /.L(x) at values of x that interest us, so that we
know what values of Y we can expect for those x. Fitting that line by eye would not be
good because it would be sUbjective; that is, different persons' results would come out
differently, particularly if the point<; are scattered. So we need a mathematical method that
gives a unique result depending only on the Il points. A widely used procedure is the method
of least squares by Gauss and Legendre. For our task we may fOlmulate it as follows.
Least Squares Principle
The straight line should be fitted through the given points so that the slim of the
squares of the distances of those points from the straight line is minimum, where
tile distance is measured in the vertical direction (the y-direction). (Formulas below.)
To get uniqueness of the straight line, we need some extra condition. To see this, take
the sample (0, I), (0, -I). Then all the lines y = k1x with any kl satisfy the principle.
(Can you see it?) The following assumption will imply uniqueness, as we shall find out.
General Assumption (Al)
The x-values Xl, . . . ,
Xn
in OLlr sample (Xl' Yl), . . . , (Xn, Yn) are not all equal.
From a given sample (Xl. Yl)' ••.. (X". Yn) we shall now determine a straight line by
least ~quares. We write the line as
(2)
and call it the sample regression line because it will be the counterpart of the population
regression line (1).
Now a sample point (Xj, )J) has the vertical distance (distance measured in the
y-direction) from (2) given by
(see Fig. 542).
SEC. 25.9
Regression.
1085
Fitting Straight Lines. Correlation
y
x
x.
J
Fig. 542.
Vertical distance of a point (Xj' Yj) from a straight line Y = ko
+ k,x
Hence the sum of the squares of these distances is
n
(3)
q
= ~
(Yj -
ko - klXj)2.
j~I
In the method ofleast squares we now have to determine ko and kl such that q is minimum.
From calculus we know that a necessary condition for this is
aq = 0
(4)
ilq
and
ilko
ilkl
=
o.
We shall see that from this condition we obtain for the sample regression line the formula
(5)
Here i and.v are the means of the x- and the y-values in our sample, that is,
(a)
j: =
-
I
(Xl
+ ... + xn)
(."1
+ . . . + Yn)·
11
(6)
I
Y=
(b)
11
The slope kl in (5) is called the regression coefficient of the sample and is given by
SXY
(7)
S 2
x
Here the "sample covariance"
1
(8)
and
Sxy
Sx
2
n
= -11 - I
~ (x· J
is
x)(,··
-J
y)
-
=
11 -
J~l
is given by
1
(9a)
Sxy
S 2
x
= -f)-I
n
~
j~I
(Xo J
x)2
=
f)-I
I
1086
CHAP. 25
Mathematical Statistics
:n.
From (5) we see that the sample regression line passes through the point Ct',
by which it
is detennined, together with the regression coefficient (7). We may call Sx 2 the variance of
the x-values, but we should keep in mind that x is an ordinary variable. not a random variable.
We shall soon also need
S 2
Y
(9b)
71
= -1-I ~
~
11 -
j=l
Derivation of (5) and (7).
(y. _ 1')2
-J.
= -I1
Il -
[n
~ 1'.2 ~ -J
j~l
-I
(71
~
n
j=l
)2]
1'.
~-J
.
Differentiating (3) and using (4), we first obtain
iJq
ako
aq
ak
= -2 ~
)'i)j -
ko - k1xj)
=0
l
where we sum over j from 1 to 11. We now divide by 2, write each of the two sums as
three sums, and take the sums containing )j and XjYj over to the right. Then we get the
"normal equations"
= ~ , ..
~.J
(10)
This is a linear system of two equations in the two unknowns ko and k1 . Its coefficient
determinant is [see (9)]
11
and is not zero because of Assumption (A I). Hence the system has a unique solmion.
Dividing the first equation of (10) by 11 and using (6), we get ko = Y - k1x. Together with
y = ko + k1x in (2) this gives (5). To get (7), we solve the system (10) by Cramer's rule
(Sec. 7.6) or elimination, finding
(II)
This gives (7)-(9) and completes the derivation. [The equality of the two expressions in
(8) and in (9) may be shown by the student: see Prob. 14].
•
E X AMP L E 1
Regression Line
The decrease of volume y ['i!-] of leather for certain fixed values of high pres~ure x [atmospheres I was measured.
The resulh are shown in the first mo columns of Table 25.1 L Find the regression line of .'. on x.
Solutioll.
and (X)
We see that
11
= 4 and obtain the values.r = 2800014 = 7000. -" = II}.O/4 = 4.75. and from (9)
SEC. 25.9
Regression.
1087
Fitting Straight Lines. Correlation
Table 25.11 Regression of the Decrease of Volume y [%]
of Leather on the Pressure x [Atmospheres]
I
Given Values
Auxiliary Values
x}
Xj
.v·
J
4000
6000
8000
10000
2.3
4.1
5.7
6.9
16000000
36000000
64000000
100000000
9200
24600
45600
69000
28000
19.0
216000000
148400
2
Sx
.
="3I
~XIJ
(
2
28 000 )
216000000 - - - 4 -
.
_ ~ (
3 148400
-
xjYj
=
20 000 000
3
28000'19) _
4
-
15400
3
.
Hence k] = 15 ·:100120000000 = 0.00077 from (7). and the regression line is
y - 4.75
=
0.000 77(x - 7000)
or
y
=
0.000 77r: - 0.64.
Note that y(O) = -0.64. which is physically meaningless. but typically indicates that a linear relation is merely
an approximation valid 011 some restricted interval.
•
Confidence Intervals in Regression Analysis
If we want to get confidence intervals, we have to make assumptions about the distribution
of Y (which we have not made so far; least squares is a "geometric principle," nowhere
involving probabilities!). We assume normality and independence in sampling:
Assumption (A2)
For each fixed x the random )'Qriable Y is /lonnalwith mean (I), that is,
(12)
and varial/ce
(]"2
independent of x.
Assumption (A3)
The
11
pel.1lJ/7llaIlCeS of the experi1llellt by which we obtain a sample
a re independent.
K1 in (12) is called the regression coefficient of the population because it can be shown
that under Assumptions (AI)-(A3) the maximum likelihood estimate of KI is the sample
regression coefficient kl given by (11).
Under Assumptions (A1)-(A3) we may now obtain a confidence interval for Kb as
shown in Table 25.12.
1088
CHAP. 25
Table 25.12
Mathematical Statistics
Determination of a Confidence Interval for
Kl
in (1) under Assumptions
(Al)-(A3)
Step 1. Choose a confidence level ')'(95%,99%, or the like).
Step 2. Determine the solution c of the equation
(13)
F(c)
~(1
=
+ ')')
from the table of the t-distribution with n - 2 degrees of freedom (Table A9 in
App. 5; 11 = sample size).
Step 3. Using a sample (Xlo Y1),
.•• ,
(xn , Yn), compute (n -
l)sx
2
from (9a), (n -
l)S.TY
from (8), kl from (7),
n
(14)
I )Sy 2 = ~ Yj 2
(n -
-
n
[as in (9b)], and
(15)
Step 4. Compute
K=c
(11 - 2)(n -
I )s,.2 .
The confidence interval is
(16)
E X AMP L E 2
Confidence Interval for the Regression Coefficient
Using the sample in Table 25.1 L determine a confidence interval for
Solution.
Step 1. We choose l'
=
/(1
by the method in Table 25.12.
0.95.
Step 2. Equation (13) takes the form HC) = 0.975, and Table A9 in App. 5 with 11
gives c = 4.30.
-
2 = 2 degrees offreedom
Step 3. From Example I we have 3s",2 = 20000000 and k1 = 0.00077. From Table 25.11 we compute
3s y2 = 102.2 =
192
4
11.95,
qo = I L')5 - 20 (X)O
om . OJ)(J0772
=
0.092.
=
4.30v'0.092/(2 . 20 000 (00)
Step 4. We thus obtain
K
= 0.000206
and
CONFo.95 (0.00056 ~
K1 ~
0.000981.
•
SEC 25.9
Regression.
1089
Fitting Straight Lines. Correlation
Correlation Analysis
We shall now give an introduction to the basic facts in correlation analysis: for proofs see
Ref. [G2J or [G8] in App. I.
Correlation analysis is concerned with the relation between X and Y in a two-dimensional
random variable (X, Y) (Sec. 24.9). A sample consists of 11 ordered pairs of values
(Xl' .\"1)' . . . , (xn , y,,), as before. The interrelation between the \" and y values in the
sample is measured by the sample covariance Sxy in (8) or by the sample correlation
coefficient
(17)
with Sx and Sy given in (IJ). Here r has the advantage that it does not change under a
multiplication of the X and y values by a factor (in going from feet to inches, etc.).
THEOREM 1
Sample Correlation Coefficient
The sample correlation coefficient r sati.~fies - I ~ r ~ 1. In particular. r
and onl.r if the sample values lie on a straight line. (See Fig. 543.)
= :::'::: 1
if
The theoretical counterpart of r is the correlation coefficient p of X and Y,
p=
(18)
where JLx = E(X) , JLy = E(Y), ux2 = E([X - JLxf), Uy2 = E([Y - JLy]2) (the means
and variances of the marginal distributions of X and Y; see Sec. 24.9), and UXy is the
r=l
10
00
••
•
••
•• •
10
00
.~.
543.
• •
•
r = 0.98
••
r=O
10
00
10
00
••
• •
10
•••
••
•
20
r = 0.6
• •
••
•
•
•
•
•
10
20
•
• •
•
•
•
•
20
10
r = -0.3
10
00
• •
•
• •
•
• •
•
•
10
20
r = -0.9
10
00
• •
•
•
10
•• •
•
20
••
Samples with various values of the correlation coefficient r
1090
CHAP. 25
Mathematical Statistics
covariance of X and Y given by (see Sec. 24.9)
(19)
UXY = E([X - J.Lx][Y - J.Ly]) = E(XY) - E(X)E(Y).
The analog of Theorem 1 is
THEOREM 2
Correlation Coefficient
The correlation coefficient p satisfies - 1 ~ P ~ 1. In particular. p = :::': 1
ollly if X alld Yare linearly related, that is. Y = yX + 8. X = y* Y + 8*.
if alld
X and Yare called uncorrelated if p = O.
THEOREM 3
Independence.
Normal Distribution
(a) Indepelldent X and Y (see Sec. 24.9) are uncorrelated.
(b) If (X, Y) is nOl1llal (see below), then uncorrelated X alld Yare
independent.
Here the two-dimensional normal distribution can be introduced by taking two independent
standardized normal random variables X*. Y*, whose joint distribution thus has the density
(20)
f*(x*. y*)
=
_1_
e-<x*2+y*2)/2
27T
(representing a surface of revolution over the x*y*-plane with a bell-shaped curve as cross
section) and setting
X
=
J.Lx
Y = J.Ly
+
+
uxX *
+ ~ uyY*.
pUyX*
This gives the general two-dimensional normal distribution with the density
f(x, y) =
(21a)
1
2 e- h (x,y)/2
27TUXUy~
where
(21b) hex. y)
=
In Theorem 3(b), normality is important, as we can see from the following example.
E X AMP L E 3
Uncorrelated but Dependent Random Variables
If X assumes -1, 0, I with probability 113 and Y = X2. then EO() = 0 and in (3)
__
3
3
1
CTXY - E(XY) - E(X ) ~ (-1) . -
3
+
1
3
0 . 3
1
= 0
3'
+] 3 . -
.
so that p = 0 and X and Yare uncorrelated. But they are cenainly not independent since they are even functlonally
~~
SEC. 25.9
Regression.
Fitting Straight Lines. Correlation
1091
Test for the Correlation Coefficient p
Table 25.13 shows a test for p in the case of the two-dimensional normal distribution.
t is an observed value of a random variable that has a t-distribution with n - 2 degrees
of freedom. This was shown by R. A. Fisher (Biometrika 10 (1915), 507-521).
Table 25.13 Test of the Hypothesis p = 0 Against the Alternative p
of the Two-Dimensional Normal Distribution
> 0 in the Case
Step 1. Choose a significance level a (5%, 1%, or the like).
Step 2. Determine the solution c of the equation
P(T:;;::: c)
I - a
=
from the t-distribution (Table A9 in App. 5) with n - 2 degrees of freedom.
Step 3. Compute r from (17), using a sample
(XIo
Yl), ... , (x", Yn)'
Step 4. Compute
t=r(~).
~~
If t
E X AMP L E 4
~
c, accept the hypothesis. If t > c, reject the hypothesis.
Test for the Correlation Coefficient p
Test the hypothesis p = 0 (independence of X and Y, because of Theorem 3) against the alternative p > 0, using
the data in the lower left corner of Fig. 543. where r = 0.6 (manual soldering errors on 10 two-sided circuit
boards done by 10 workers; x = front, y = back of the boards).
Solution. We choose a = 5%; thus 1 - a = 95%. Since n = 10, n - 2 = 8, the table gives c = 1.86.
Also. t = 0.6VS/0.64 = 2.12 > c. We reject the hypothesis and assert that there is a positive correlation. A
worker making few (many) errors on the front side also tends to make few (many) errors on the reverse side of
the board.
•
11-101
SAMPLE REGRESSION LINE
Find and sketch or graph the sample regression line of Y
and x and the given data as points on the same axes.
1. (-1, 1), (0, 1.7), (1, 3)
2. (3, 3.5), (5, 2), (7, 4.5), (9, 3)
3. (2, 12), (5, 24), (9. 33), (14, 50)
4. (11, 22), (15, 18), e17, 16), (20, 9), (22, 10)
x
6
9
II
13
22
7. x = Revolutions per minute. y
engine [hpJ
5. Speed x [mph] of a car
30
40
50
60
Stopping distance y [tt]
150
195
240
295
Also find the stopping disrance ar 35 mph.
6. x = Deformation of a certain steel [mm], y
hardness [kg/mm 2 ]
26
28
= Brinell
33
35
= Power of a Diesel
x
400
500
600
700
750
y
580
1030
1420
18!m
2100
1092
CHAP. 2S
8. Humidity of air x [%]
Expansion of gelatin y [%]
Mathematical Statistics
10
20
30
40
111-131
0.8
1.6
2.3
2.8
Find a 95% confidence interval for the regression
coefficient Kl, assuming that (A2) and (A3) hold and using
the sample:
9. Voltage x [V]
40
40
80
80
110
110
Current)' [A]
5.1
4.8
10.0
LO.3
13.0
12.7
CONFIDENCE INTERVALS
11. In Prob. 6
Also find the resistance R [il] by Ohms' law
(Sec. 2.9].
10. Force x [Ib]
Extension y [in] of a spring
2
4
4.1
7.8
6
8
12.3 15.8
Also find the spring modulus by Hooke's law
(Sec. 2.4).
~
.. ··1_'.'=01..·
1. What is a sample? Why do we take samples?
2. What is the role of probability theory in statistics?
3. Will you get better results by taking larger samples?
Explain.
4. Do several samples from a certain popUlation have the
same mean? The same variance?
5. What is a parameter? How can we estimate it? Give an
example.
6. What is a statisticaL test? What errors occur in testing?
7. How do we test in quality control?
8. What is the x2-test? Give a simple example hom
memory.
9. What are nonparametric tests? When would you apply
them?
10. In what tests did we use the I-distribution? The
X 2 -distribution?
11. What are one-sided and two-sided tests? Give typical
examples.
12. List some areas of application of statistical tests.
13. What do we mean by "goodness of fiC?
14. Acceptance sampling uses principles of testing. Explain.
15. What is the power of a test? What can you do if the
power is low?
12. In Prob. 7
13. In Prob. 8
14. Derive the second expression for
first one.
2 in (9a) from the
Sx
15. CAS EXPERIMENT. Moving Data. Take a sample,
for instance, that in Prob. 6, and investigate and graph
the effect of changing y-values (a) for small.\', (b) for
large x, (c) in the middle of the sampLe.
S T ION SAN D PRO B L EMS
18. Couldn't we make the error in interval estimation zero
simply by choosing the confidence level I?
19. What is the Least squares principle? Give applications.
20. What is the difference between regression and
cOIl'elation analysis?
21. Find the maximum likelihood estimates of mean and
variance of a normal distribution using the sample 5, 4,
6,5,3,5,7,4,6,5,8,6.
22. Determine a 95% confidence interval for the mean fL of
a normal population with variance 0"2 = 16, using a
sample of size 400 with mean 53.
23. What will happen to the length of the interval in Prob.
22 if we reduce the sample size to 100?
24. Determine a 99% confidence interval for the mean of a
normal population with standard deviation 2.2, using the
sample 28,24,31,27,22.
25. What confidence interval do we obtain in Prob. 24 if
we assume the variance to be unknown?
26. Assuming normality, find a 95% confidence interval for
the variance hom the sample 145.3, 145.1, 145.4, 146.2.
127-291
Find a 95% confidence interval for the mean fL,
assuming normality and using the sample:
27. Nitrogen content [%] of steel 0.74. 0.75. 0.73, 0.75,
0.74.0.72
16. Explain the idea of a maximum likelihood estimate from
memory.
28. Diameters of 10 gaskets with mean 4.37 em and
standard deviation 0.157 cm
17. How does the length of a confidence interval depend on
the sample size? On the confidence level?
29. Density [g/cm 3 J of coke 1.40, 1.45, 1.39, 1.44, 1.38
1093
Summary of Chapter 25
30. What sample size should we use in Prob. 28 if we want
to obtain a confidence interval of length 0.1. assuming that
the standard deviation of the samples is (about) the same?
131-321
Find a 99'1t confidence interval for the variance
2
0- • assuning normality and using the sample:
31. Rockwell hardness of tool bits 64.9. 64.1, 63.8. 64.0
32. A sample of size II = 128 with variance s2 = 1.921
33. Using a sample of IO values with mean 14.5 from a
normal population with variance 0-2 = 0.25. test the
hypothesis flo = 15.0 against the alternative fLl = 14.4
on the 5% level.
34. In Prob. 33. change the alternative to fL =1= 15.0 and test
as before.
35. Find the power in Prob. 33.
36. Using a sample of 15 values with mean 36.2 and
variance 0.9. te~t the hypothesis fLo = 35.0 against the
alternative fLl = 37.0. assuming normality and taking
a = 1%.
37. Using a sample of 20 values with variance 8.25 from a
normal population. test the hyothesis 0-02 = 5.0 against
the alternative 0-1 2 = 8.1. choosing a = 5%.
38. A firm sells paint in cans containing I kg of paint per
can and is interested to know whether the mean weight
differs significantly from I kg, in which case the filling
machine must be adjusted. Set up a hypothesis and an
alternative and perform the test. assuming normality and
using a sample of 20 fillings having a mean of 991 g
and a standard deviation of 8 g. (Choose a = 5%.)
39. Using samples of sizes to and 5 with variances s/ = 50
and Sy2 = 20 and assuming normality of the conesponding
populations. test the hypothesis Ho: 0".,,2 = o-y2 against
the alternative 0".",2 > o-y2. Choose a = 5%.
40. Assume the thickness X of washers to be normal with
mean 2.75 mm and variance 0.00024mm2. Set up a
control chaIt for fL. choosing a = I %, and graph the
means of the five samples (2.74. 2.76). <2.74. 2.74).
(2.79.2.81), (2.78, 2.76), (2.71. 2.75) on the chart.
41. What effect on UCL - LCL in a control chart for the
mean does it have if we double the sample size? If we
switch from a = I % to a = 5'70?
42. The following sample~ of screws (length in inches) were
taken from an ongoing production. Assuming that the
population is normal with mean 3.500 and variance
0.0004, set up a control chart for the mean, choosing
a = I %, and graph the sample means on the chart.
Sample No.
Length
2
3
4
5
6
7
3.49 3.48 3.52 3.50 3.51 3.49 3.52 3.53
3.50 3.47 3.49 3.51 3.48 3.50 3.50 3.49
43. A purchaser checks gaskets by a single sampling plan
that uses a sample size of 40 and an acceptance number
of I. Use Table A6 in App. 5 to compute the probability
of acceptance of lots containing the following
percentages of defective gaskets !'It. !%. I 'It, 2%. 5%.
10%. Graph the OC curve. (Use the Poisson
approximation.)
44. Does an automatic cutter have the tendency of cutting
longer and longer pieces of wire if the lengths of
subsequent pieces [in.] were to.l. 9.8. 9.9, 10.2, 10.6,
to.5?
45. Find the least squares regression line to the data (-2. I).
n. (6. 5).
(0. 1). (2, 3), (4 ...
Mathematical Statistics
We recall from Chap. 24 that with an experiment in which we observe some quantity
(number of defectives. height of persons, etc.) there is associated a random variable
X whose probability distribution is given by a distribution function
(I)
F(x)
=
P(X
~ x)
(Sec. 24.5)
which for each x gives the prObability that X assumes any value not exceeding x.
1094
CHAP. 2S
Mathematical Statistics
In statIstIcs we take random samples Xl, . . . , Xn of size n by performing that
experiment n times (Sec. 25.1) and draw conclusions from properties of samples
about properties of the distribution of the con-esponding X. We do this by calculating
point estimates or confidence intervals or by peIiorming a test for parameters
(/-L and (T2 in the normal distribution. p in the binomial distribution. etc.) or by a
test for distribution functions.
A point estimate (Sec. 25.2) is an approximate value for a parameter in the
distribution of X obtained from a sample. Notably, the sample mean (Sec. 25.1)
1
(2)
X
= -
n
n
2: -'J =
j~l
1
-
n
(Xl
+ ... +
XII)
is an estimate of the mean /-L of X, and the sample variance (Sec. 25.1)
(3)
is an estimate of the variance (T2 of X. Point estimation can be done by the basic
maximum likelihood method (Sec. 25.2).
Confidence intervals (Sec. 25.3) are intervals 81 ~ 8 ~ 82 with endpoints
calculated from a sample such that with a high probability 'Y we obtain an interval
that contains the unknown true value of the parameter 8 in the distribution of X.
Here, 'Y is chosen at the beginning, usually 95% or 99%. We denote such an interval
by CONF y {8l ~ 8 ~ 82 }.
In a test for a parameter we test a h)pothesis 8 = 80 against an alte171ative 8 = 81
and then, on the basis of a sample, accept the hypothesis. or we reject it in favor of
the alternative (Sec. 25.4). Like any conclusion about X from samples, this may
involve en-ors leading to a false decision. There is a small probability a (which we
can choose, 5% or 1%, for instance) that we reject a true hypothesis, and there is a
probability f3 (which we can compute and decrease by taking Larger samples) that
we accept a false hypothesis. a is called the significance level and I - f3 the power
of the test. Among many other engineeIing applications, testing is used in quality
control (Sec. 25.5) and acceptance sampling (Sec. 25.6).
If not merely a parameter but the kind of distribution of X is unknown, we can
use the chi-square test (Sec. 25.7) for testing the hypothesis that some function
F(x) is the unknown distribution function of X. This is done by determining the
discrepancy between F(x) and the distribution function F(x) of a given sample.
"Distribution-free" or nonparametric tests are tests that apply to any distribution,
since they are based on combinatorial ideas. These tests are usually very simple.
Two of them are discussed in Sec. 25.8.
The last section deals with samples of pairs of values, which arise in an
experiment when we simultaneously observe two quantities. In regression analysis,
one of the quantities, x, is an ordinary variable and the other, Y, is a random variable
whose mean /-L depends on x, say, /-L(x) = Ko + KIX, In correlation analysis the
relation between X and Yin a two-dimensional random variable (X, Y) is investigated.
notably in terms of the correlation coefficient p.
I
APPENDIX
1
References
Software see at the beginning of Chaps. 19
and 24.
General References
[GR I] Abramowitz, M. and 1. A. Stegun (eds.), Handbook
of Mathematical Functiol1s. 10th printing, with
corrections. Washington, DC: National Bureau of
Standards. 1972 talso New York: Dover, 1965).
[GR2] Cajori, F., History of Mathematics. 5th ed.
Reprinted. Providence. RI: American Mathematical
Society. 2002.
[GR3] Courant, Rand D. Hilbert, Methods of
Mathematical Physics. 2 vols. Hoboken, NJ: Wiley, 2003.
[GR4] Courant, R, Differelltial and Integral Calculus.
2 vols. Hoboken, NJ: Wiley, 2003.
IGR5] Graham. R L. et aI., Concrete Mathematics. 2nd
ed. Reading. MA: Addison-Wesley. 1994.
IGR6] Ito, K. (ed.), Encyclopedic Dictionmy of
Mathematics. 4 vols. 2nd ed. Cambridge, MA: MIT
Press, 1993.
[GR7] Kreyszig, E., Introductory Functional Analysis
with Applications. New York: Wiley, 1989.
[GR8] Krey~zig, E.. Differemial Geometry. Mineola. NY:
Dover, 1991.
IGR9] Kreyszig. E. Introduction to Diflerentilll Ceometr.V
alld Riemannian CeO/netr". Toronto: University of
Toronto Press, 1975.
IGRlOl Szeg6, G., Orthogonal Polyno/lliaL~. 4th ed.
Replinted. New York: American Mathematical Society,
2003.
[GRII] Thomas, G. et aI., Thomas' Calculus, Early
TranscendelltaL~ Update.
10th ed. Reading, MA:
Addison-Wesley, 2003.
Part A. Ordinary Differential Equations
(ODEs) (Chaps. 1-6)
See also Part E: Numeric Analysis
[AI] Arnold, V. L Ordinm) D(fferential Equations. 3rd
ed. New York: Springer. 1997.
[A2] Bhatia, N. P. and G. P. Szego, Stability Theory of"
Dynamical Systems. New York: Splinger, 2002.
[A3] Birkhofl", G. and G.-c. Rota, Ordinary D(fferential
Equations. 4th ed. New York: Wiley, 1989.
[A4] Brauer. F. and J. A. Nohel, Qualitative Theory of
Ordinary D!f/erelltial Equations. Mineola, NY: Dover,
1994.
[A5] Churchill, R V .. Operatiollal Mathematics. 3rd ed.
New York: McGraw-Hill, 1972.
[A6] Coddington, E. A. and R Carlson, Linear Ordinary
D(fferelltial Equations. Philadelphia: SIAM, 1997.
[A7] Coddington, E. A. and N. Levinson, Theory of
Ordinary Differential Equations. Malabar, FL: Krieger,
1984.
[A8] Dong, T.-R et al., Qualitative Theory ofD(fferential
Equations. Providence, RI: American Mathematical
Society, 1992.
[A9] Erdelyi, A. et aI., Tables of Integral Transfonns.
2 vols. New York: McGraw-Hill, 1954.
IAlO] Hartman. P .. Ordinary Differential Equations. 2nd
ed. Philadelphia: SIAM, 2002.
IAII] lnce, E. L., Ordinary Differential Equations. New
York: Dover, 1956.
[A12] Schiff, J. L., The Laplace Tran.~fonll: Theory and
Applications. New York: Springer, 1999.
[A13] Watson. G. N., A Treatise on the Theory of Bessel
Functions. 2nd ed. Reprinted. New York: Cambridge
University Press, 1995.
[AI4] Widder, D. V., The Laplace Transfol1ll. Princeton,
NJ: Princeton University Press, 1941.
[A 15] Zwillinger, D., Handhook of Differential Equations.
3rd ed. New York: Academic Press, 1997.
Part B. Linear Algebra, Vector Calculus
(Chaps. 7-10)
For books on numeric linear algebra, see also
Part E: Numeric Analysis.
IBI] Bellman, R., Introduction to Matrix Analysis. 2nd
ed. Philadelphia: SIAM, 1997.
IB2] Chatelin, F., Eigenvalues of Matrices. New York:
Wiley-Interscience, 1993.
[B3] Gantmacher. F. R, The Theory of Matrices. 2 vols.
Providence, RI: American Mathematical Society.
2000.
[B4] GOhberg, I. P. et a!., Invariant Subspaces ofMatrices
with Applications. New York: Wiley, 1986.
[B5] Greub, W. H., Linear Algebra. 4th ed. New York:
Springer, 1996.
[B6] Herstein. I. N., Ahstract Algebra. 3rd ed. New York:
Wiley. 1996.
[B7] John, A. W .. Matrices and Tensors in Physics. 3rd
ed. New York: Wiley, 1995.
Al
A2
References
[B8J Lang. S .. Linear Algebra. 3rd ed. New York:
Springer. 1996.
[B9] Nef. W .. Linear Algebra. 2nd ed. New York: Dover.
1988.
[B 10] Parlett. B.. The Symmetric Eigenmlue Problem.
Philadelphia: SIAM. 1997.
Part C. Fourier Analysis and POEs
(Chaps. 10-11)
For books on numerics for PDEs see also Part
[D7] Krantz. S. G .. Complex Analysis: The Geometric
ViCllpoi11t. Washington. DC: The Mathematical
Association of America. 1990.
[D8] Lang, S .. Complex Analysis. 3rd ed. New York:
Springer, 1993.
[D9] Narasimhan, R. Compact Riell/ann SlIIfaces. New
York: Springer, 1996.
[D 10] Nehari. Z.. C01!f017lwl Mapping. Mineola. NY:
Dover. 1975.
[DIl] Springer. G., Introduction to Riemann SlIIfaces.
Providence, RI: American Mathematical Society, 1002.
E: Numeric Analysis.
[CII Antimirov. M. Ya .. Applied Imegral Tramfol1ns.
Providence. RI: American Mathematical Society, 1993.
[Cl] Bracewell, R, Tile Fourier TralJSforlll and Its
Applications. 3rd ed. New York: McGraw-Hili. 2000.
[C3] Carslaw. H. S. and J. C Jaeger, COllduction (~f Heat
in Solids. 2nd ed. Reprinted. Oxford: Clarendon, 1986.
[C4J Churchill. R V. and J. W. Brown, Fourier Series
and BOllndary Value Problellls. 6th ed. New York:
McGraw-HilL 2000.
IC5] DuChateau. P. and D. Zachmann. Applied Partial
Differential Equations. Mineola. NY: Dover. 2001.
[C6] Hanna, J. Rand J. H. Rowland. Fourier Series,
Tran.~fo1711s. and Boundary Value Problems. 2nd ed.
New York: Wiley. 1990.
[C7] leni. A. L The Gibbs Phenomenon ill Fourier
Analysis. Splines. and WCII'e!et Approximations. Boston:
Kluwer. 1998.
[C8] John. E. Partial Differelltial Equatiolls. New York:
Springer. 1995.
[C9] Tolstov. G. P .. Fourier Series. New York: Dover,
1976.
[CIO] Widder. D. V .. The Heat Equation. New York:
Academic Press. 1975.
[CII] Zauderer, E.. Partial Dif(eremial Equatio/lS of
Applied Mathematics. 2nd ed. New York: Wiley, 1989.
[CI2] Zygmund, A. and R Fefferman. Trigonometric
Series. 3rd ed. New York: Cambridge University Press,
2003.
Part
o. Complex Analysis (Chaps. 13-18)
[D I] Ahlfors. L. V.. Complex Analysis. 3rd ed. New
York: McGraw-Hill. 1979.
[D2] Bieberbach. L.. C01!formal Mapping. Providence,
RI: American Mathematical Society. 1000.
[D3] Hemici. P .. Applied and Computational Complex
Analysis. 3 vols. New York: Wiley. 1993.
[D4] Hille, E., Analytic Function Theory. 1 vols. 2nd ed.
Providence. Rl: American Mathematical Society. 1997.
[D5] Knopp. K.. Eleme11ts at tbe Theory of Functions.
Ne\\' York: Dover. 1951.
[D6] Knopp, K .. TheOlY of Functions. 2 parts. New York:
Dover, 1945, 1947.
Part E. Numeric Analysis (Chaps. 19-21)
[EI] Ames, W. E, humerical Methods for Partial
Ditferelltial Equations. 3rd ed. New York: Academic
Press, 1992.
[E2] Anderson, E., et aI., LAPACK User's Guide. 3rd ed.
Philadelphia: SIAM, 1999.
[E3J Bank, R E., PLTMG. A Software Package for
Solring Elliptic Partial Differential Equaticms: Users'
Guide 7.0. Philadelphia: SIAM. 1994.
[E4] Constanda. C, Solution Techniques for Elementary
Partial Differelltial Eqllations. Boca Raton, R: CRe
Press. 2001.
[E5] Dahlquist, G. and A. Bjorck. NUII/erical Methods.
Mineola. NY: Dover. 2003.
[E6] DeBoor. C .. A Practical Guide to Splilles. Reprinted.
New York: Springer. 1991.
[E7] Dongarra. J. J. et aL LlNPACK Users Guide.
Philadelphia: SIAM. 1978. (See also at the beginning of
Chap. 19.)
[E8] Garbow, B. S. et al., Matrix Eige11system Routi11es:
EISPACK Gllide Extensioll. Reprinted. Kew York:
Springer, 1990.
rE9] Golub, G. H. and C F. Van Loan, Matrix
Computatio11s. 3rd ed. Baltimore, MD: Johns Hopkins
University Press, 1996.
[E 10] Higham, N. J., Accuracy a11d Stability o.f Numerical
Algorithms. 2nd ed. Philadelphia: SIAM, 2002.
[E 11] IMSL (International Mathematical and Statistical
Libraries). FORTRAN Numerical Library. Houston, TX:
Visual Numerics, 2001. (See also at the beginning of
Chap. 19.)
[E 12] IMSL IMSL for Jam. Houston, TX: Visual
Numerics, 2001.
[El3] IMSL C Library. Houston. TX: Visual Numelics.
1002.
[E14] Kelley, C T., Iteratil'e Methods for Li11ear alld
NOlllillear Equatio11s. Philadelphia: SIAM. 1995.
[EI5J Knabner. P. and L. Angerman, Numerical Methods
.for Partial Differelltial Equatiolls. New York: Springer,
1003.
[EI6] Knuth. D. E., The Art of Computer Programmillg.
3 voIs. 3rd ed. Reading, MA: Addison-Wesley, 1005.
AJ
App.l
[E 17] Kreyszig. E., Imroductory Funcrimwl Analysis with
Applications. New York: Wiley, 1989.
[E181 Kreyszig. E.. On methods of Fourier analysis in
muItigrid theory. Lecture Notes in Pure and Applied
Mathematics IS7. New York: Dekker, 1994, pp. 22S-242.
[EI9] Kreyszig, E., Basic ideas in modem numerical
analysis and their origins. Proceedings of the Annnal
COIlference (dthe Canadian Society for the History (llId
Philosophy of Mathematics. 1997. pp. 34-4S.
[E20J Kreyszig. E.. and J. Todd. QR in two dimensions.
Elelllellte da Mathematik 31 (1976). pp. 109-114.
[E21] Mortensen. M. E., Geometric Modelin!!,. 2nd ed.
New York: Wiley. 1997.
[E22] Morton, K. W., and D. F. Mayers, Numerical Solution
of Partial Differential Equations: An Imroduction. New
York: Cambridge University Press, 1994.
[E23] Ot1ega. J. M .. Introduction to Parallel and Vector
Solution of Linear Systems. New York: Plenum Press,
19HK
[E24J Overton, M. L., Numerical Computing I\'ith IEEE
Floating Poillt Arithmetic. Philadelphia: SIAM. 2001.
[E2S] Pre~s, W. H. et aL Numerical Recipes in C: The Art
of Scientific Computing. 2nd ed. New York: Cambridge
University Press, 1992.
[E26] Shampine, L. F., Numerical Solutions of Ordinary
Differential Equations. New York: Chapman and Hall,
1994.
[E27] Varga, R. S., Matrix Iterative Analysis. 2nd ed. New
York: Springer, 2000.
[E28] Varga, R. S .. GerJgorin and His Circles. New York:
Springer. 2004.
[E29] Wilkinson. J. H.. The Algebraic Eigenl'{lilte
Problem. Oxford: Oxford University Press, 1988.
Part F. O"ltimization, Gra"lhs (Chaps. 22-23)
[f-J] Bondy, J. A., Graph Them:,' with Applications.
Hoboken, NJ: Wiley-Interscience, 2003.
[F2] Cook. W. J. et aL Combinatorial Optimi;::ation. New
York: Wiley. 1993.
[F3] Diestel, R.. Graph Theory. 2nd ed. New York:
Springer, 2000.
[F4] Diwekar. U. M., Introduction to Applied Optimi;::atimz.
Boston: Kluwer. 2003.
[FS] Gass. S. L.. Linear Programming. Method and
Applications. 3rd ed. New York: McGraw-Hill. 1969.
[F6] Gross. J. T., Handbook of Graph Them:" and
Applications. Boca Raton. FL: CRC Press, 1999.
[F7) Goodrich. M. T., and R. Tamassia, Algorithm
Design: Foundations, Analysis. and Imemet Examples.
Hoboken, NJ: Wiley, 2002.
[FH] Harm)" F., Graph TheOl}". Reprinted. Reading, MA:
Addison-Wesley, 2000.
[F91 Merris, R.. Graph Theory. Hoboken. NJ: WileyInterscience, 2000.
[FIOJ Ralston, A .. and P. Rabinowitz. A First Coune in
Numerical Analysis. 2nd ed. Mineola, NY: Dover. 200 I.
[FIIJ Thulasiraman. K., and M. N. S. Swamy, Graph
Theory and Algorithms. New York: Wiley-Interscience,
1992.
[FI2J Tucker, A.. Applied Combinarorics. 4th ed.
Hoboken. NJ: Wiley, 2001.
Part G. Probability and Statistics
(Chaps. 24-25)
[G I] Amelican Society for Testing Materials, Manual on
Presentation of Data alld Control Chart Analysis. 7th
ed. Philadelphia: ASTM. 2002.
[G2] Anderson. T. W., An Imroduction to Multivariate
Statistical Analysis. 3rd ed. Hoboken. NJ: Wiley, 2003.
[G3] Cramer. H .. Mathematical Methods of Statistics.
Reprinted. Princeton, NJ: Princeton University Press,
1999.
[G41 Dodge, Y., The Oxford Dictionary of Statistical
Terms. 6th ed. Oxford: Oxford University Press. 2003.
[GS] Gibbons. J. D., NOl/parametric Statistical Irlference.
4th ed. New York: Dekker. 2003.
[G6] Grant. E. L. and R. S. Leavenworth. Statistical
Quality Control. 7th ed. New York: McGraw-HilI.
1996.
[G7] IMSL, Fortran Numerical Librm:r. Houston. TX:
Visual Numelics, 2002.
[G8] Kreyszig, E .. Illtroductory Mathematical Statistics.
Principles {UJd Methods. New York: Wiley, 1970.
[G9] O'Hagan, T. et aI., Kendal/'s Advanced TheOlY of
Statistics 3-Volllllle Set. Kent, U.K.: Hodder Arnold,
2004.
[GIO] Rohatgi, V. K. and A. K. MD. E. Saleh, All
Imrodllction to Probability and Statistics. 2nd ed.
Hoboken, NJ: Wiley-Interscience. 200\.
'r: A P PEN D I X
,,1,
2
'I
Answers to
Odd-Numbered Problems
Problem Set 1.1, page 8
1. (cos 7T.X)/7T +
7. Second order
11. y = ~ tan ('2x
13. y = e-:L:l
C
+
5. First order
9. Third order
n7T), n = 0, ±I, ±2, ...
15. (A) No. (B) No. Only y = O.
17. )''' = g, y' = gt, Y = gt 2 /2
19. )''' = k, y' = kt + 6, y = ~kt2 + 6t, y(60) = 1800k + 360
y' (60) = 1.47·60 + 6 = 94 [rnlsec] = 210 [mph]
21. ekH = ~,H = (1n ~)/k = (1011 In 2)/1.4 = 1570 [yearsl
= 3000, k
=
1.47,
Problem Set 1.2, page 11
15. y = x(l - In x) + c
2
17. Verify the general solution y2 + t = c. Circle of radius 3Vz
19. 111V ' = 111g - bv 2 , v' = 9.8 - v 2 , v(O) = 10. v' = 0 gives the limit
V9.8 = 3.1 [meter/sec].
11. y
=
-(2/7T) cos ~7TX I- c
Problem Set 1.3, page 18
+
3. cos 2y dy = '2 dx, y = ~ arcsin (4x + c)
7. dy/y = cot 77X dx, )' = c(sin 7TX)lhT
t2
11. r = roe-
36x 2 = c, ellipses
9. y = tan (c - e- 7rx/7T)
13. I = Ioe- RtiL
15.y=ex/~
17. y = 4ln x
5. )'2
= Vln (X - 2x + e)
y' = (y - b)/(x - a), y - b = c(x - a)
yoe k = 2yo, e k = 2 (l week), e2k = 22 (2 weeks), e4k = 24
y = yoe kt = yoe- 0,OO01213t = )'oe-O.OO01213.4000 = 0.62yo; 62%; cf. Example 2.
27. y' = -ky, y = Yoe-1<t, e- 5k = 0.5, k = -(1n 0.5)/5 = 0.139,
f = -(1n 0.05)/0.139 = 22 [min]
29. T(O) = 10. T = 23 - 13ek t, T(2) = 23 - 13e2k = 18. k = -0.478, T = 22.8
gives t = [In (-0.2/-13)]/(-0.478) = 8.73 [min].
19.
21.
23.
25.
y
2
V2ih
31. h = gt 2 /2, t = Y2h/g, v = gt = gY2h/g =
33. y' = 0 - (2/800)y, y = 200e- 0,0025t, f = 300 [min], y(300)
= 94.5 [lb]
35. (A) is related to the enor function and (C) concerns the Fresnel integral C(x); see
App.3.1. (D) y' = '2.\y + I, yeO) = 0
A4
App. 2
AS
Answers to Odd-Numbered Problems
Problem Set 1.4, page 25
1. Exact. x4 + y4 = e
3. Exact. u = cos TTX sinh Y + key), u y = cos TTX cosh y + k', k' = O.
Ans. cos TTX sinh y = e
5. Exact. 9x2 + 4y2 = e
2f1
2f1
7. Exact, Mil = NT = -2e- , u = re- 2(J + k«(}), U e = -2re- + k', k' = O.
2fJ
Ans. re- 21i = e, r = ee
9. Exact. u = ylx + sin 2x + key). lIy = lIx ...!.. k' = IIx - 2 sin 2y.
AlIS. ylx + sin 2x + cos 2y = e
11. Not exact. F = 1/x2 by Theorem I. -ylx 2 dx + IIx dy = d(ylx) = O. Ails. Y = ex
13. - 3y 2/x 4 dx + 2ylx3 dy = d(\,2/x 3) = O. Y = e).2./2 (semicubical parabolas)
15. Exact, U = e 2x cos Y + key), lly = _e2x sin y + k', k' = O. AlIS. e2x cos y = e,
e = 1
17. Not exact. Try R. F = e- x , e-X(cos wx + w sin wx) dx + dy = 0, U = Y + [(x),
x
U x = [' = e-X(cos wx + w sin wx), U = Y + [ = Y - e- cos wx = c, c = 0
Y
Y
19. U = eX + key), u y = k' = -1 + e , k = -y + e • Ans. eX - y + e Y = e
21. B = C, !Ax2 + Cxy + !Dy2 = e
Problem Set 1.5, page 32
3. y
7.
9.
13.
17.
=
Cf;,-3.5x
+
0.8
ee- kx
5. y = 2.6e-1.25x + 4
2
e /(x13k if k =1= 0
11. y = 2xe cos 2x
15. Y = e llX (x2 + c), c = 4.1
+ c (if k = 0). y =
+
Separate. y - 2.5 = c cosh4 1.5x
y = sin 2x + c/sin2 2x, e = 1
y = (c + cosh lOx)/x3 • Note (X 3 y)' = 5 sinh lOx.
y = x
19. )' = l/u
!
,
1I
= ce- 57
.x
-
6.5
5.7
-
21. u = y-2 = ecc\ l + ce2x ), c = 3, u(O) = 4
23. Separate variables. y2 = 1 - ce cos x, e = - 1/e
25 • .v' = Ry + k. y = ce Rt - klR. c = Yo + /.JR. Yo = 1000, R = 0.06.
t = 65 - 25 = 40, k = 1000. Y = $178076.12. StaJt at 45 gives
Yo[(I + 1I0.06)eo.o6.20 - 110.06] = 41.988732yo = 178076.12, Yo = k = $4241.05.
27. y' = 175(0.0001 - y/450), yeO) = 450· 0.0004 = 0.18,
y = 0.135e-O.3889t + 0.045 = 0.1812,
e-O.3889t = (0.09 - 0.045)/0.135 = 1/3.
t = (In 3)/0.3889 = 2.82. AilS. About 3 years
29. y' = A - ky, yeO) = o. y = A(l - e-kt)/k
31. y' = By2 - Ay = By(y - AlB), A> 0, B > O. Constant solutions y = 0, y = AlB.
y' > 0 if y > AlB (unlimited growth),.v' < 0 if 0 < y < AlB (extinction).
y = AI(ceAt + B), yeO) > AlB if c < 0, yeo) < AlB if e > O.
33. y' = y - y2 - 0.2y, Y = 11(1.25 - 0.75e- O . 8t ), limit 0.8, limit 1
35. y' = y - 0.25y2 - O.ly = 0.25y(3.6 - y). Equilibrium harvest 3.6,
y = 18/(5 + ce-O. 9t )
37. (YI + )'2)' + P(YI + Y2) = c.v/
39. (YI + .\"2)' + P(YI + Y2) = (YI'
41. Solution of eyt' + PQ'l = e(y/
+ PYI) + (1'2' + Ph) = 0 + 0 = 0
+ PYI) + (Y2' + PY2) = r + 0 = r
+ PYl) = cr
A6
App. 2
Answers to Odd-Numbered Problems
43. CAS Experiment (a) y = x sin (lIx) + c\. c = 0 if y(2/rr) = 2/rr. y is undefined at
x = 0, the point at which the "waves" of sin (11x) accumulate; the factor x
makes them smaller and smaller. Experiment with various x-intervals.
(b) )' = x"Lsin (llx) + c]. y(2/rr) = (2/rr)n. n need not be an integer. Try n = ~.
Try n = - 1 and see how the "waves" near 0 become larger and larger.
45. y = uy*, y' + py = u'y* + uy*' + puy* = u'y* + u(y*' + py*) = u'y* + U' 0
= r, u' = r/y* = re Jp da', U = I eJp d.e r dx + c. Thus, y = UYh gives (4). We shall
see that this method extends to higher-order ODEs (Secs. 2.10 and 3.3).
Problem Set 1.6, page 36
1. y' = 4, .v' = -1/4, Y = -x/4 + c*
3. y/x = c, y'lx = Y/X2, y' = ylx,)" = -X/y,)'2 + x 2 = c*, circles
5.2xy + x\" = 0, y' = -2y/x, y' = x/(2y), y2 - x2/2 = c*. hyperbolas
7. ye-.l:2/2 = c, y' = xy, y' = -1I(xy), yy' = -l/x, y2/2 = -In Ixl + c**,
x = c*e- y2/2 , bell-shaped curves (with x and y interchanged)
9. y' = -4x/y. y' = )il4x, 4 In l}il = In Ixl + c':'*. x = C*J4. parabolas
11. xe- yl4 = c, y' = 4/x,}i' = -x/4, y = -x 2 /8 + c*
13. Use dy/d\ = I/(dx/dy). (y - 2x)e'" = c, tv' - 2 + Y - 2x)e" = 0,
y' = 2 - Y + 2x, dxld"y = -2 + v - 2:r is linear,
dx/dy + 2x = Y - 2, x = c*e- 2y + y/2 - 5/4
15. II = c, uxdx + uydy = 0, y' = -u,Ju y. TrajectOlies y' = uy/ux- Now v = c*.
v,rdx + vydy = 0, y' = -v:r/Vy. This agrees with the trajectory ODE in u if
U.l , = uy (equal denominators) and u y = -v.~ (equal numerators). But these are just
the Cauchy-Riemann equations.
17.2r + 2y.v' = o. y' = -x/yo Trajectories y' = 5h. In IJI = In Ixl + c**, y = c*x.
19. y' = -4.\19)'. Trajectories}i' = 9}'14x. y = c*x9/4 (c* > 0). Sketch or graph these
curves.
Problem Set 1.7, page 41
1. In Ix - xol < a; just take b in ex = b/K large. namely, b = aK.
3. No. At a common point (Xl> .vI) they would both satisfy the "initial condition""
.v(Xl) = Yl, violating uniqueness.
5.)" = f(x. y) = rex) - pCr))': hence af/ay = -p(x) is continuous and i~ thus
bounded in the closed interval Ix - xol ~ iI.
7. R has sides 2a and 2b and center (1, 1) since y(l) = 1. In R,
f = 2y2 ~ 2(b + 1)2 = K, a = b/K = b/(2(b + 1)2). da/db = 0 gives b = 1, and
a opt = b/K = 1/8. Solution by dy/)'2 = 2 dx, etc., y = 11(3 - 2x).
9. 11 + .v 2 1 ~ K = I + b 2 , a = b/K. da/db = O. b = 1. a = 1/2.
Chapter 1 Review Questions and Problems, page 42
+ ~) = 4 dx. 2 arctan 2y = 4x + c*. y = ! tan (21' + c)
= l/u, y' = -u' /u 2 = 4/u - 1/112, 1/ = c*e-4.r + ~
15. dy/(y2 + 1) = x 2 dx, arctan y = x 3/3 + c, y = tan (x 3/3 + c)
17. Bernoulli.),' + xy = x/y, u = )'2, II' = 2)y' = 2x - 2ru linear,
2 J X2")
U = e'
(e' ~X dr + c) = 1 + ce- x , l' = V I u. Or write
,
(2
1) and separate.
.
)y = -x y 11. dy/(y2
13. Logistic ODE. y
-:t:
Z
--\.
App. 2
A7
Answers to Odd-Numbered Problems
19. Linear, y = e Cos xU e- cos x sin x dx
+
ce cos x
+ 1. Or by separation.
21. Not exact. Use Theorem 1, Sec. 1.4: R = 2/x, F = x 2 : the resulting exact ODE is
3x 2 sin 2y dx + 2x 3 cos 2y dy = d(x 3 sin 2y), x 3 sin 2y = c. Or by separation,
cot 2y dy = - 3/(2x) d.r. etc., sin 2y = n·- 3 .
23. Exact. /I = I M dx = sin xy - x 2 + k, lIy = X cos xy + k' = N, k = y2,
sin .l)" - x 2 + y2 = c.
25. Not exact. R* = 1 in Theorem 2, Sec. 1.4, F* = e Y • Exact is
e Y sin (y - x) dx + eY[cos (y - x) - sin (y - x)J dy = O.
II = I M dx = eY cos (y - x) + k, lIy = eY(cos (y - x) - sin (y - x» + k' = N,
eY cos (y - x) = c.
27. Separation. )'2 + x 2 = 25
29. Separation. )' = tan (x + c), c =
31. Exact. u = X\2 + cos X + 2-" = c, c = 1I(0, 1) = 3
33. y' = x/yo Trajectories y' = -Yfx. y = c*/x by separation. Hyperbolas.
35. Y = Yoekt, e 4k = 0.9, k = ! In 0.9, e kt = 0.5,
f = (In 0.5)/k = (In 0.5)/[(ln 0.9)/4] = 26.3 [daysl
37. ekt = 0.01, t = On O.OI)/k = 175 [days]
39. y' = -4x/)'. Trajectories y = CIX1l4 or X = C2y4
41. Logistic ODE y' = Ay - By2, Y = 1/1/, U' + All = +B, 1I = ce- At + B/A
43. A = amount of incident light. A thin layer of thicknes!> ,lx absorbs M = -kALh
(-k = constant of proportionality). Thus ,lA/b.x = -kA. Let b.x ~ O. Then
A' = -kA. A = Aoe-kl: = amount of light in a thick layer at depth x from the
surface of incidence.
c)
=
-!7T
Problem Set 2.1, page 52
1. \" = 2.5e4x + 0.5e- 4x
3. y = e- x cos x
9. Yes if a 1= 0
15. F(x, z, z') = 0
7. Yes
13. No
19. .'" dddy = 4:::, y = (CIX + C2)-1I3
21. (dddy)~ = _.::3 sin y, -11.:: = -dx/dy
23. y"y' = 2. y = ~(t + 1)3/2 - i. ),(3) =
25. y" = ky',.::' = k.:: . .:: = cle kx = y', Cl
5. Y = 4x2
11. No
= cos y + CI. X = -sin y +
3l, /(3) = 4
= 1, Y = (e kx - l)/k
+ 7/x 2
ClY
+
C2
Problem Set 2.2, page 59
1. Y = cle7X + C2 e - x
5. y = cleO.9X + C2e-L1X
9. y = cle3.5X + C2e-1.5X
13. y = Cl e12,- + C2e-12X
3. Y = (ci + c2x )e2.5x
7. y = eO. 5X(A cos I.5x + B sin 1.5.\)
11. y = A cos 3rr.r + B sin 3rr.r
15. y" - 3y' + 2y = 0
19. y" - 16y = 0
23. Y = e- 2x(2 cos x - sin x)
17. -" " - 2'[;;3'
v.J y + 3Y = 0
3x
21. Y = 4e - 2e- X
27. y = (2 - 4x)e-O. 25x
25. \" = 2 + e- 7TX
29. y = e-o.1:r(3.2 cos 0.2x + 1.6 sin 0.2x)
31. \" = 4e5X - 4e- 5J;
x
x
x
33')"1 = e- '.'"2 = O.OOle + e· E
•
,
1
,
35. W nte
= e -a:"C/2 , c = cos wx, s = S1l1
wx. Note that £ = -"2a£, C = -ws,
2
s' = we. Substitute, drop £, collect c-terms, then s-terms, and use w = b - !a 2 ,
to get c(b - !a 2 + !02 - w 2) + s( -ow + !aw + ~aw) = 0 + 0 = O.
A8
App. 2
Answers to Odd-Numbered Problems
Problem Set 2.3, page 61
1. O. 0, -2 cos x
3. -0.8 X 3
S. -12x 3 + 9x2 + 8x - 2. -28 sin 4x - 4 cos 4x. 0
7. Y
11. y
=
=
2X
(Cl + c2x )eCle-3.1X + C2e-x
+
6x
2
+ 0.4,
O. eO. 4x
= e- 3X(A cos 2x + B sin 2x)
13. y = A cos 4.2wx + B sin 4.2wx
9. y
Problem Set 2.4, page 68
1. Y = Yo cos wof + (uo/wo) sin wof. At integer f (if Wo = 7T), because of periodicity.
3. lIlLe" = -lIlg sin e = -mge (tangential component of W = lIlg). e" + w02e = O.
WO/(27T)
=
\/i/i/(27T).
5. No. because the frequency depends only on kIm.
7. (i) Greater by a factor vi (ii) Lower
9. w* = [w0 2 - c 2/(41112)1112 = wo[1 - c 2/(411lk)]112 = wo(l - c 2/8111k) = 2.9583
11. 27T/W* since Eq. (10) and y' = 0 give tan (w';'f - 8) = -a/w*; tan is periodic
with period 7T/W"'.
13. Case (II) of (5) with c = "\, '417lk = V 4· 500' 4500 = 3000 [kg/secJ. where 500 kg
is the mass per wheel.
15. y = [Yo + (uo + aYo)f1e- at , Y = [1 + (uo + l)t]e- t ; (ii) u o = -2. -3/2, -4/3,
-5/4, -6/5
17. Y = 0 gives Cl = -C2e-2{Jt, which has one or no positive zero, depending on the
initial conditions.
Problem Set 2.5, page 72
1. CIX3
+
C2X-2
5. xlA cos (In Ixl)
9. CIXO.1 + C2XO.9
+
13. x-o. 5 [2 cos (10 In
B sin (In
Ixl) -
Ixl)]
sin (10 In
Ixl)]
3. (Cl + C2 In Ixl )x4
7. CIX1.4 + C2X1.6
11.3x2 - 2x 3
15.2.\"-3 + 10
Problem Set 2.6, page 77
1. y" - 0.25.'" = 0, W = -1
3. y" - 21..;/ + k 2 y = 0, W = e2kx
5. x 2/ ' + 0.5x/ + 0.0625), = 0, W = x-O. 5 7. x2y" + xy' + 4y = 0, W = 2/x
9. x2y"
0.75.'" = O. W = -2
11. y" - 6.25." = O. W = 2.5
13. y" + 2/ + 1.64." = O. W = 0.8e- 2x
15. y" + 5/ + 6.34y = 0, W = 0.3e- 5X
7 6rr
17. y" + 7.67Ty' + 14.44~\" = 0, W = e- . :,.
Problem Set 2.7, page 83
1. cIe- x
+ c 2e- 2x + 2.5e2x
3. cIe4X + C2 e - 4X + 2.4xe4.'t
3
5. Cle2X + C2e-3X - x - 3x - 0.5
7. e- 3X(A cos 8x + B sin 8x) + eX(cos 4x + ! sin 4x)
9. c 1e-O. 4x + C2eo.4.'t + 20xeo.4.'t - 2Q,e-O. 4x
11. Cl cos 1.2x + C2 sin 1.2x + lOx sin 1.2x
2T
13. e- (A cos x + B sin x) + 5x 2 - 8x + 4.4 - 1.6 cos 2x + 0.2 sin 2x
15. 4x sin 2x
-
4e x
App. 2
A9
Answers to Odd-Numbered Problems
17. e-O. IX ( 1.5 cos 0.5.t - sin 0.5.\:) + 2eO. 5x
19. 2e- 3X + 3e4x - 12.\"3 + 3x 2 - 6.5x
Problem Set 2.S, page 90
1. -0.4 cos 3t + 7.2 sin 3t
3. -12.8 cos 4.5t + 3.6 sin 4.5t
5.0.16 cos 2t + 0.12 sin 2t
7. 475 cos 3t - 415 sin 3t
9 • c 1 e- t/2 + C 2 e- 3tJ2 - 32
5 cos t - 1
5 sin t
11. (ci + c2t)e-3t/2 - ~ cos 31 - sin 3t
13. e-1. 5t (A cos t + B sin t) + 4 + 0.8 cos 2t - 6.4 sin 2t
15. 0.32e- t cos 5t + 0.68 cos 3t + 0.24 sin 3t
17. 5e- 41 - 4e- 2t - 0.3 cos 2t + 0.1 sin 2t
19. e-1. 5t (O.2 cos t - 1.1 sin t) + 0.8 cos t + 0.4 sin t
Problem Set 2.9, page 97
1. LI'
3. Rl'
+ Rl = E,1 = (E/R) + ce- RtJL = 2.4 +
+ lIC = O. I = ce-tJ(RC)
ce- 50t
5. I = 5(cus t - cos 1Ot)/99
7.10 is maximum when S = 0; thus C = 1/(w2 L).
9. R > Rcrit = 2VLIC is Case I. etc.
11.0
13. c l e- 20t + C2e-lOt + 16.5 sin lOt + 5.5 cos lOt
15. £' = -e- 4t (7.605 cos ~t + 1.95 sin ~t), I = e-o.lt(A cos ~t + B sin ~t)
- e- 4t cos ~t
17. £(0) = 600. I' (0) = 600, 1 = e- 3t ( - 100 cos 4t + 75 sin 4t) + 100 co'> t
19. (b) R = 2 fl, L = 1 H, C = 1112 F, £ = 4.4 sin lOt V
Problem Set 2.10, page 101
+ B sin x - x cos x + (sin x) In Isin xl
3. CIX + C2X2 - X cos x
5. (cos X)(ci + sin .\" - In Isec x + tan xl) + (sin X)(C2 - cos x)
= (cl - In Isec x + tan xl) cos x + C2 sin x
7. (CI + ~x) sin x + (C2 + In Icos xl) cos x
9. (Cl + c2x)ex + x 2 + 4x + 6 - eXOn Ixl + I)
11. c] cos 2x + C2 sin 2x + ~.\: cosh 2x
13. CIX + C2X2 - x sin x
1. A cos x
15. A cos x + B sin r + Ypi + Y p 2. Ypi as .\"p in Example 1, .\"p2 = ~~ sin 5x
17. lI" + 1I = 0 by substitution of Y = lIX- I/2 . YI = x- 1I2 cos x . .\"2 = x- 1I2
1I2 sin from (2) with the ODE in standard
sin x. Yp = _~x1l2 cos X +
form.
ix-
x
Chapter 2 Review Questions and Problems, page 102
9.
4
cle ."
11. e-
4X
+
C2e-2X -
(A cos 3r
+
1.1 cos 6x - 0.3 sin 6x
B sin 3x) - ~ cos 3x
+
~ sin 3x
A10
App. 2
Answers to Odd-Numbered Problems
13. Y1 = x 3 , Y2 = x- 4 , r = x- 5 , W = -7x- 2 , Yp = - 412X-3 - ~X-3 = -*x- 3
15. )'1 = eX, Y2 = xe x , W = e2.T, Yp = e X I(2x)
17. -'"1 = eX cos X')'2 = eX sin x, W = e 2x , yp = -xe x cos x + eX(sin x) In Isin xl
19. y = 4e 2x + 2e- 7x
21. Y = 9x- 4 + 6x 6
2
3x
23. Y = e-2.1' - 2e- + 18x - 30t" + 19
25. Y = ~X3 + 4x 2 - 5x- 2
27. Y = -16 cos 2t + 12 sin 2t + 16(cos 0.5t - sin 1.5t).
Resonance for wl(27T) = 2/(27T) = 1/7T
29. w = 3.1 is close to Wo = -vkj;, = 3, Y = 25(cos 3t - cos 3.1t).
31. R = 9,0, L = 0.5 H. C = 0.025 F, E = 17 sin 6t V, hence 0.5/" + 91' + 401
= 102 cos 6t, 1= -8.16e- 8t + 7.5e- lOt + 0.66 cos 6t + 1.62 sin 61
33. E' = 220·314 cos 314t, I = e- 50t(A cos 150t + B sin 150r) + 0.847001 sin 3141
- 1.985219 cos 314t
Problem Set 3.1, page 111
7. Linearly independent
11. xlxl = x2 if x > 0, linearly dependent
13. Linearly independent
17. Linearly independent
9. Linearly dependent
15. Linearly independent
19. Linearly dependent
Problem Set 3.2, page 115
+ 11y' -
=
3. yiv - Y = 0
7. C1 + C2 cos x -I:=. C3 sin x
9. C1ex + (c2 + c3x)e-·"
11. C1 ex + c2il+v7lX + c 3e(1-\ 7)x
13. eO. 25x + 4.3e-O. 7X + 12.1 cos O. Lt - 0.6 sin O.Ix
15. 2.4 + e-1.6x(cos 1.5x - :2 sin 1.5x)
17. y = cosh 5x - cos 4x
19. y = c 1x- 2 + C2X + C3X2. W = 121x2
1. Y'" - 6y"
5. /v + 4-,""
=
6-,"
0
0
Problem Set 3.3, page 122
1. (Cl + c2x)e2x + C3e-2.T - 0.04e- 3x + x 2 + X +
3. c] cos ~x + C2 sin ~x + X(C3 cos ~x + C4 sin ~x) - ~e-x sin !x
5. C1XO.5 + C2X + C3X1.5 + 0.1.\.5·5
7. C] cos X + C2 sin x + C3 cos 3x + C4 sin 3x
9. Y = (4 - x 2)e 3X - 0.5 cos 3x + 0.5 sin 3x
11. x- 2 - x 2 + 5x 4 + x(In x + I)
13. 3 + ge- 2x cos 9x - (1.6 - 1.5x)e X
+ 0.2 cosh
2x
Chapter 3 Review Questions and Problems, page 122
7. Cl + C 2x 1/2 + C3X-1/2
9. cle-O. 5x + C2 e O. 5X + C3e-L5x
2
11. Clx (i In x - ~) + C2X2 + c3.,· + C4 + tx 7
13. C1e-x + e X / 2(c2 cos (~V'3x) + C3 sin (~v'3x» + 8ext2
15. (c1 + c2x)eX + C3e-x + 0.25x 2e x
17. -0.5x- 1 + 1.5x- 5
19. cos 7x + e 3x - 0.02 cosh x
App. 2
An
Answers to Odd-Numbered Problems
Problem Set 4.1, page 135
1. Yes 5. y~ = 0.02(-)'1 + )'2), y~ = 0.02(\'1 - 2)'2 + )'3)')'~ = 0.02(.1'2 - )'3)
7. Cl = 1, C2 = -5
9.3 and 0
2t
11. )'~ = )'2, y~ = 4Yb .1'1 = c 1e- + C2 e2t = y, Y2 = v~
13. y~ = Y2' y~ = )'2, eigenvalues 0, 1')'1 = Cl + C2 e t, Y2 = y~ = y'
IS.)'~ = )'2, )'~ = 0.109375.1'1 + 0.75)'2 (divide by 64). Yl = cle-O.125t + c2eO.875t
Problem Set 4.3, page 146
+ C2e 6t , )'2 = -2Cle-6t + 2c2e6t
3')'1
Cle2t + C2, )'2 = Cl e2t - C2
5. Yl = Cle4it + C2e-4it = (Cl + C2) cos 41 + i(CI - C2) sin 41
= A cos 4t + B sin 4t, Y2 = iC1e4it - iC2e-4it
= (iCI - iC2) cos 4t + i(iCI + iC2) sin 4t = B cos 4t - A sin 4t, A
1.
)"1
=
=
c 1e- 6t
= Cl
+
C2.
B = i(CI - C2)
2e 1 + C2e-6t • .1'2 = -Cl + C3e-6t, )'3 = -Cl + 2(C2 + c 3 )e-6t
9f + 2c e-1. 8t , )'2 = 2Cle1.8t + c e- O.9t - 2C3e-1.8t,
)"1 = c1e1.8t + 2c2e-O.
2
3
)'3 = 2Cle1.8t - 2C2e-O.9t + C3 e -1. 8t
11')'1 = 10 + 6e 2t , )'2 = -5 + 3e 2t
13. )"1 = 2.4e- t - 2e2.5t , .1'2 = 1.8e- t + 2e2.5t
15')'1 = 2e 14.5t + 10, Y2 = 5e 14.5t - 4
7.
9.
)'1
=
17. )'2 = .1'~ + Yb Y~ = )'~ + )'~ = -)'1 - Y2 = -Yl - ()'~ + .1'1), y~ + 2y~ + 2Y1 = 0,
Y1 = e-t(A cos 1 + B sin t), )'2 = y~ + )'1 = e-t(B cos t - A sin t). Note that
r2 = Y12 + .1'22 = e- 2t (A 2 + B2).
19.11 = 4c1e-200t + C2 e - 50t , 12 = -Cle-200t - 4C2e-50t
Problem Set 4.4, page 150
1. Saddle point, unstable, .1'1 = Cle-4t + c 2e 4t, )'2 = -2c 1e- 4t + 2c2 e 4t
3. Unstable node . .1'1 = Clet + C2e3t, )'2 = -Clet + C2e3t
5. Stable and attractive node, .'11 = Cle-3t + C2e-5t, .'12 = c 1e- 3t - c 2 e- 5t
7. Center, stable'.\'1 = A cos 41 + B sin 4t, .1'2 = -2B cos 4t + 2A sin 4t
9. Saddle point, unstable, .1'1 = Cle3t + C2e-t, .1'2 = C1e3t - C2e-t
11.)\ = Y = c 1e kt + c 2 e- kt ')'2 = y', hyperbolas k 2.'112 - Y2 2 = const
13. )' = e- 2t (A cos t + B sin t), stable and attractive spirals
17. For instance, (a) -2, (b) -1, (c) -~, (d) 1, (e) 4.
Problem Set 4.5, page 158
(9,
0), y~ = )'2, Y~ = 3.1'1, saddle poim; (0, -1)')'1 = 5\')'2 = -1 + Y2, y~ = -Y2,
)1 = 35\. center
3. (0, 0), Y~ = 4.1'2. Y~ = 2Yl, saddle point; (2, 0), Yl = 2 + )lb )'2 = )12' Y~ = 4Y2,
y~ = -2Yl' center
5. (0, 0), Y~ = -Yl + )'2, y~ = - Yl - .1'2' stable and attractive spiral point; (-2, 2),
)'1 = -2 + )11> Y2 = 2 + Y2, Y~ = -.h - 3.V2, y~ = -j\ - )12, saddle point
1.
App. 2
7 • .\'~
Answers to Odd-Numbered Problems
=
.\"2' Y~
= -.\"10 - 4)'1), (0, O),)'~ = .\"2, .,.~ = -."1, center;
+ Yb )'2 = 5'2, Y~ = 5'2, Y~ = (-! - 5'1)( -4Yl)' y~ =
(!, 0)'.\'1 = !
Yl' saddle
9. (~7T ::!: 2117T, 0) saddle points; (-~7T ::!: 2117T, 0) centers.
Use -cos (::!:~7T + :VI) = sin (::!:5'1) = ::!:Yl'
11 . .\"~ = .\'2, y~ = -)'1(2 + )'1)(2 - .\"1)' (0, 0), y; = -4.\"1' center; (-2, 0), y~ = 85'1'
saddle point; (2, 0), y~ = 85'1 saddle point
13. )'''/y' + 2y' /)' = 0, In y' + 2 In Y = c, y' y2 = Y2Y] 2 = const
15.), = A cos t + B sin t, radius YA 2 + B2
Problem Set 4.f page 162
3. Yt = A cos 4t + B sin 41 + ~~, )'2 = B cos 4t - A sin 4t - ~t
5. )'1 = Cle4t + C2e-3t + 4, Y2 = c 1e 4t - 2.5c2e-3t - 10
7. Y1 = 2cle-9t + C2e-4t - 90t + 28'.\"2 = Cle-9t + C2e-4t - J26t + 14
9')'1 = Clet + 4c2e 2t - 3t - 4 - 2e- t ')'2 = -Clet - 5C2e2! + 5t + 7.5 + e- t
11')'1 = 3 cos 2t - sin 2t + t + 1,)'2 = cos 2t + 3 sin 2t + 2t - ~
13'."1 = 4e- t - 4et + e 2t'."2 = -4e- t + t
15'."1 = 7 - 2e 2t + e 3t - 4e- 3t , )'2 = _e 2t + 3e- 3t
17. I~ + 2.5(ft - 12) = 845 sin t, 2.5(1~ - I~) + 2512 = 0,
11 = (95 + 162.5t)C 5t - 95 cos t + 312.5 sin t,
12 = (-30 - 162.5t)e-5t + 30 cos t + 12.5 sin t
19. I~ + 2(11 - 12) = 200, 2(12 - It) + 812 + 2 I 12 dt = O.
II = 2cleA,t + 2C2e"~2t + 100,
12 = (1.1 + Vo:4T)cleA,t + 0.1 - \"0.41)C2eAzt, Al = -0.9 + \,'0.41,
A2 = -0.9 - v'6AT
Chapter 4 Review Questions and Problems, page 163
11')'1 = Cle8t + C2e-8t, )'2 = 2c 1e 8t - 2c2e-8t. Saddle point
13. ."1 = Clet + C2e-6t, ."2 = Clet - 6c2e-6t. Saddle point
15. )'1 = Cle7.5t + C2e-3t, Y2 = -Cle7.5t + 0.75c2e-3t. Saddle point
17. ."1 = Cle5t + C2et, .\'2 = Cle5t - c2e t . Unstable node
19. Yl = e-t(A cos 2t + B sin 2t), )'2 = e-t(B cos 2t - A sin 2t). Stable and
attractive spiral point
21. )'1 = Clet + C2e-t + e 2t + e- 2t , )'2 = -C2e-t - 1.5e- 2t
23')'1 = Clet + C2e-2t - 6e- t - 5. Y2 = -Clet - 2c2e-21- + 10e- t + 6
25• .\"1 = Cle3t + C2e-t + t 2 - 2t + 2, )'2 = c 1e 3t - C2e-t - t 2 + 2t - 2
27. A saddle point at (0, 0)
29. I] = 4e- 40t - e- lOt , 12 = _e- 40t + 4e- lOt
31. (117T, 0) center for even 11 and saddle point for odd n
33. Saddle points at (0, 0) and (~, ~), centers at (0, ~) and (~, 0)
(oblem Set 5. 1 page 170
1. aoO + x + ~X2 + ... ) = aoe x
3. aoO - 2X2 + ~X4 - + ... ) + al(x - ~x3
= ao cos 2,r + ~al sin 2x
+
I~X5 -
+ ... )
App. 2
AU
Answers to Odd-Numbered Problems
5. ao(1 + ~x)
7. ao + aox + (~ao
+ ~)x2 + ... = aoe x + eX - x - I = ce x 9. ao + a1x + ~alx2 + ... = ao - a l + alex
11. s = ~ - 4x + 8x 2 - 3ix 3 + 3ix 4 - I;:X5, s(O.2) = 0.69900
13. s = ~ + ~x - 1s x 3 + ~OX5, sCI) = 0.73125
15. s = I + x - x 2 - ~X3 + ~X4 + ~!X5, s@ = ~~~
x-I, c
= ao + 1
Problem Set 5.2, page 176
1. lei
5.0
3. 2 (as function of t = lx - 3)2). Ans.
9. 1
13.
V2
7. 2
11.
7T
15.
L
(_1)S-l
L
xS'R
5(s - 2)
s~3
= 1
,
(s - 4)2
s~5
(s-3)!
xS ' R =
Cf)
'
+ tx 3 + 2~X4 - :14x5 - ... )
19. ao + aI(x - ~X3 + ~X5 - 2i x 7 + 227X9 - I~5Xll + - ... )
21. lIo(l - ~X2 - :14x4 + 7I~ox6 + ... ) + aI(x - tx 3 - 214X5 + 1O~SX7 + ... )
23. ao(1 + x 2 + x 3 + X4 + x 5 + x 6 + ... ) + alx
17. ao(l - l2X4 _lo-\:5 - ... )
+
a1(x
+
~x2
Problem Set 5.3, page 180
3. P 6lx) = I~l231x6 - 315x4 + 105x2 - 5),
P 7lx) = I~(429x7 - 693x 5 + 315x 3 - 35x)
7. Set x = az.. y = cIPn(x/a) + C2Qn(x/a)
=~. P 21 = 3x~, P22
2
P 4 = (l - x 2)(105x2 - 15)/2
15. P l 1
= 3(1 - x 2),
Problem Set 5.4, page 187
1.
VI
= 1+
3•."1 = 1 -
x2
~
3!
~
12
5. r(r - 1) + 4r
+
X4
~
5!
I
~-.r4
384
+
+
1
2
+ ... =
-
sinh x
x
. ..
Y2
v = 9". 1 In
'.2
=
x
x -
2 = 0, r l = -1, r2 = -2; Y1 =
+
24
720
+
x
2!
1M
3
+ x + .,.
~
4!
36
x
x
6
+
x2
2
rl
25x 4
1024
+ ... ,
120
+ - ...
7. Euler-Cauchy equation with t = x + 3, h = (x + 3)5, Y2 = ."1 In (x
9. ho = 1, Co = 0, r2 = 0, Yl = e- x , ."2 = e- x In x
11'."1 = 1I(x + 1), .1'2 = 1Ix
13. ho = ~, Co = 0,
cosh x
x
-2+
X4
=
+ 3)
= ~, r2 = 0'."1 = x I/2 (1 + 2x + 2X2 + ~X3 + ... ),
Y2=I +2x+2x + ...
15'."1 = (x - 4)7, .1'2 = (x - 4)-5 (Euler-Cauchy with t = x - 4)
3
17'."1 = X + x - I~X4 + I~X5 - 2~X6 + .. " Y2 = 1 + 3x 2 - tx 3
2
+
~~x6
-+ ...
+
~X4 - ~x5
+ ...
14
App. 2
Answers to Odd-Numbered Problems
= c1F(~, ~, ~; x) + C2 -yt;:F(l, 1,~; x)
21. y = A(l - 4x + ~X2) + B-yt;:F( -~, ~, ~: x)
23. y = c1F(2, -2, -~; t - 2) + C2(t - 2)3/2F(~,
19. Y
-~,~; t - 2)
Problem Set 5.5, page 197
1. Use (7b) in Sec. 5.2.
3.0.77958 (exact 0.76520), 0.19674 (0.22389). -0.27651 (-0.26005).
-0.39788 (-0.39715), -0.17038 (-0.17760), 0.15680 (0.15065), 0.30086
(0.30008),0.16833 (0.17165)
5. Y = C1I,,(Ax) + c2L,,(Ax:), v*" 0, ± I, .. .
7. Y = C1I,,(-yt;:) + C2LvC-yt;:), v
O. ± L .. .
9. Y = C1xI1(2x), II, I_I linearly dependent
11. y = X-"lC1I,,(X) + C2I_,,(X)],
0, ±1, .. .
13. Y = C1I,,(x3) + C2L,,(x3), v =1= 0, ± 1. .. .
15. Y = c]-yt;:1 1(2-yt;:), I]. I_I linearly dependent
17. Y = Xl/4ft(~Xl/4), II' L1 linearly dependent
*'
v*'
19. y
=
.~/\C1I8/5(4xl/4)
+ c2I_s/5(4x l/4»
= 0, (24a) with v = I, (24d) with
v = 2, respectively.
23. I n (X1) = I,,(x2) = 0 implies x 1- n 1,,(X1) = X2 -71 I,,(x2) = 0 and [x- n 1 n (x)]' = 0
somewhere between Xl and X2 by Rolle's theorem. Now use (24b) to get
1 n + l (x) = 0 there. Conversely, 1,,+ 1(X3) = 1,,+I(X4) = 0, thus
X3n-lIn+I(X3) = X4n+11n+1(X4) = 0 implies 1n(x) = 0 in between by Rolle's
theorem and (24a) with v = 11 + L
25. Integrate the formulas in (24).
27. Use (24a) with JJ = 1, partial integration, (24b) with v = 0, partial integration.
33. CAS Experiment. (b) Xo = I. Xl = 2.5, X2 = 20, approximately. It increases with
(c) (14) is exact. (d) It oscillates. (e) Formula (24b) with v = 0
21. Use (24b) with v
Problem Set 5.6, page 202
1. )' = C]15(X) + C2 Y5(X)
3. Y = C1I O(-yt;:) + C2 Yo(-yt;:)
2
5. Y = C112(X2) + C2 Y2(X )
7. y = X-\CI15(X) + C2Y5(X»
11. Set H(l) = kH c21, use (10).
9. Y = X3(c1I3(X3) + C2 Y3(X3»
13. Set x = is in (l), Sec. 5.5, to get the present ODE (12) in terms of s. Use (20),
Sec. 5.5.
Problem Set 5.7, page 209
3. Set x = ct + k.
5 ..\ = cos e. dx = -sin e de, etc.
7. Am = (11lr./5)2, 111 = 1,2, ... ;
Ym = sin tll1r.x/5)
9. Am = [(2m + 1)r.12L]2, m = 0, 1. ... :
Ym(x) = sin [(2m + l)m/2L]
2
11. Am = 111 , m = 0, 1, ... ;
Yo = 1, Ym = cos II1X, sin IIlX, III = 1,2, ...
13. k = k m from tan k = -k.
Am = k m 2 , III = 1,2, ... ;
Ym = sin kmx
15. Am = 1112, 111 = 1, 2, ... ;
Ym = x sin (m In /x/)
17• j'J -- e Sx, q -- 0, r -- e Sx, Am
,
. IIIX, III = I , 2,...
= III 2 ; Ym = e -4x SIn
19. Am = (1Ilr.)2, Ym = X cos lIlr.x, X sin 111r.X, m = 0, 1, ...
n.
App. 2
A15
Answers to Odd-Numbered Problems
Problem Set 5.8, page 216
3. ~P3(X) - ~P2(X)
1. 1.6P4 (x) - 0.6Po(x)
7. -0.4775P l (x) - 0.6908P3(x)
+ ~Pl(X) - ~Po(.x)
+ 0.1544Pg (x) + ... ,
+ 1.844P5 (x) - 0.8234P7 (x)
= 9. Rounding seems to have considerable influence in Probs. 6-15.
9. 0.3799P2(x) + 1.673P4 (x) - 1.397P6 (x) + 0.3968Ps(x) + ... . 1110 = 8
11. 1.175Po(x) + 1.l04Pl (x) + 0.3575P2(x) + 0.0700P3(x) - .... 1Il0 = 3 or 4
1110
13. 0.7855Po(x) - 0.3550P2(x) + 0.0900P4 (x) - ... ,
= 4
1110
15. 0.1212Po(x) - 0.7955P2(x) + 0.9600P4 (x) - 0.3360P6 (x)
2
17. (c) am = (2IJ1 (aO;rn»(J1(aO. m )/ao,m) = 2/(ao. m ll(aO. m
»
+ .... 1Il0 =
8
Chapter 5 Review Questions and Problems, page 217
11. e 3x , e- 3x , or cosh 3x, sinh 3x
13. eX, 1 + x
15. e-X'-, xe- x2
17. e- x , e- x In x
2
2
19. I/(l - x ), x/(l - x ) or 1/(1 - x), 1/(1 + x)
21. Y = ("11V2(6x) + c21 _ v2(6x)
23. Y = Clit (x2)
+
25. y = ~[clll/4(~kx2) + ['21 _l/4(!kx 2)J
27. Am = (21117Tl, Yo = I, Ym = cos nl17TX, sin 2m7Tx,
I, 2, ...
111 =
2
C2 Yl (x )
o.
29. Y = c l l l (kx) + C2Yl(kx). C2 = O. y(l) = c l l l (k) =
k = k m = al,m (the positive
zeros of 1 1 ), Ym = l l (a1.",x)
31. 1.813Po(x) + 2.923P l (x) + 1.759P2(x) + 0.663P3(x) + 0.185P4 (x) + ...
33. 0.693Po(x) - 0.285P2(x) + 0.144P4 (x) - 0.09IP6 (x) + ...
35.0.25Po(x) + O.5P1 (x) + 0.3 125P2(x) - 0.0938P4 (x) + O.0508P6 (x) + ...
Problem Set 6.1, page 226
2
2
s
s
s cos () - w sin ()
7. ---;;2:------:2;:---
+
S
k
13. s
(1 -
s-2
s
1."""3-"""2
w
e- bs )
5.
15.
1
1
9.--s + 2b
1 - (1
2
(s - 2) -
11.-S2 + 4
1 - e- bs
be- bs
17. ---;;:--- - - -
+ 2s)e- 2S
----;;2---
2s
s
S2
(I - e- s )2
19. - - - s
23. Set ct
= p.
Then :£(f(ct»
=
{'oC e-stf(ct) dt L=e-(s'C)Pf(p) dp/c
=
= F(s/c)/c.
0
29. 4 cos
7Tt -
3 sin
7Tt
35.2 - 2e- 4t
39.
45.
1
Vs
sin
,,1st -
37. (ev'3t - e-V5t)/(V3
e- 5t
41.
(s -
3.8
2
2.4)
+ k) + b
2
+ k) + 1
a(s
(s
51. 3e- 2t sin 5t
53. e- 5 "1Tt sinh 7Tt
+ Vs)
5w
43.
(s
2
+ a) +
w
2
A16
App. 2
Answers to Odd-Numbered Problems
Problem Set 6.2, page 232
1
1.
(s - k)
2
7. (S2 + ~17"2)2
9. Use shifting. Use cos 2 0:' = ~ + ~ cos 20:'; use cos 2 0:' + sin2 0:' = 1.
Ans. (2S2 + 1) 1[2s(S2 + 1)]
11. (s + ~)Y = -1 + 17· 21(s2 + 4), y = 7e-t{2 + 2 sin 2r - 8 cos 2t
13. (S2 - ~)Y = 4s, y = 4 cosh ~t
15. (S2 + 2s + 2)Y = s - 3 + 2 . 1. Y = (s + 1 - 2)/[(s + 1)2 + 11,
y = e-t(cos t - 2 sin t)
17. (S2 + 7s + 12)Y = 3.58 - 10 + 24.5 + 211(s - 3), Y = ~e3t + ~e-4t
19. (s + 1.5fY = s + 31.5 + 3 + 541s 4 + 64ls,
Y = lI(s + 1.5) + lI(s + 1.5)2
y = (1 + t)e-1. 5t + 4t3 - 16t2
+ ~e-3t
241s4 - 321s 3 + 321s2,
32t
21. t = t + 2, l' = 4/(s - 6), ji = 4e 6t, y = 4e 6Ct - 2)
23. t = t + 1, (s - l)(s + 4)1' = 4s + 17 + 6/(s - 2), Y = 3et - 1 + e 2Ct - ll
25. (b) In the proof, integrate from 0 to 1I and then from II to 0: and see what happens.
(c) Find 3:;(f) and 3:;(f') by integration and substitute them into (1 *).
27.2 - 2e-
t/2
29.
+
+
1
""k.2
(e
kt
~ r;::
31. cosh v5t - 1
t
k
- 1) -
33. t sinh 2t - ~t
Problem Set 6.3, page 240
3. (l - e 2 -
2S
(~
S3
~
S2
5.
7.
+
s
S2
+
17"2
)
I(s - 1)
+
~)
e- s - (~
S
S3
+
~
S2
e
(-S21 + -10)
s
( -e -s - e -4s)
-20s
(e- 3s
S2 + 17"2
1
13. - - (e- 2s + 27T _ e- 4s + 47T )
s - 17"
15. 0 if t < 4, t - 4 if t > 4
17. sin t if 217" < r < 817", 0 elsewhere
19. 0 if t < 2, (t - 2)4/24 if t > 2
21. lI(t - 3) cosh (2r - 6)
t
23. e- sin t
25. e- 2t cos 3t + 9 cos 2t + 8 sin 2t
27. sin 3t + sin r if 0 < t < 17" and ~ sin 3t if t > 17"
29. t - sin t if 0 < t < 1, cos (t - 1) + sin (t - 1) - sin t if t > I
31. et - sin t + u(t - 217")(sin t - ~ sin 2t)
33. t = 1 + t, 'f" + 4ji = 8(1 + t)2(1 - lI({ - 4», cos 2t + 2r2 - I if t < 5,
cos 2r + 49 cos (2r - 10) + 10 sin (2t - 10) if t > S
35. Rq' + qlC = 0, Q = 3:;(q), q(O) = CVo, i = q' (r). R(sQ - CVo) + QIC = 0,
q = CVoe- tlCRC )
11.
37. IO[
100
+ -[
+
e- 6s )
100
= - - e- 2s , [
s
S2
1 - e- lOCt - 2 ) if t > 2
1= e- 2s ( S
s
]
+ 10
), i = 0 if t < 2 and
-lOs
App. 2
Answers to Odd-Numbered Problems
=
39. i
e- 20t
+
+ lI(t - 2)[ - 20t + 1 + 3ge-20Ct-2J]
5t
490e- [l - lI(t - 1)], i = 20(e- 5t - e- 250t )
20t - 1
41. O.Ii' + 25i =
+ e-250t+245]
43. i
(10 sin lOt
=
45. i'
+
+
2i
2
A17
f
+
20£1(1 - 1)[-e- 5t
+
100 sin t)(lI(t - 7T) - u(t - 37T))
t
i(T) dT
= 1 - 1I(t - 2), I =
(1 - e- 2S )/(s2
+
+
2s
2),
°
i = e- t sin t - lI(t - 2) e-t+2 sin (t - 2)
= 27 cos t + 6 sin t + lI(t - 27T) [- 27 cos t
47. i
e- t (27 cos 3t + 11 sin 3t)
- 6 sin t + e-(t-2TI"J(27 cos 3t
+
11 sin 3t)1
Problem Set 6.4, page 247
1. Y = 10 cos t if 0 < t < 27T and 10 cos t + sin t if t > 27T
3. Y = 5.5e t + 4.5e- t + 5(e t - 1I2 - e- t +1/2 )u(t - ~) - 50(e t - 1 - e-t+l)lI(t - 1)
5. Y = O.lfe t + e- 2 t(-cos t + 7 sin t)]
+ O.llI(t - lO)[-e t + e- 2t+ 3 0(cos (t - LO) - 7 sin (t - 10))]
7. Y = 1 + le- t sin 3t + lI(t - 4)[-1 + e-t+ 4 (cos (3t - 12) + l sin (3t - 12))]
l~ lI(t - 5)e -t+5 sin (3t - 15)
9. Y = 5t - 2 - 501l(t - 7T)e- t +7r sin 2t. Straight line. sharply deformed between 7T
and about 8
11. Y = (O.4t + 1.52)et
+
0.48e- 4t
+
1.611(t -
2)[ -et + e- 4t +1O ]
Problem Set 6.5, page 253
1. t
1
5. - sin wt
I
kt
7. 2k (e
w
9. ~(e3t - e- 5t )
ll.
13. t - sin t
15.
19. Y = 3/((S2 + 4)(S2 + 9», y = 0.3 sin 2t 21. (S2 + 9)Y = 4 + 8(1 + e-=)/(s2 + 1), y =
23. 0 if 0
<
t
<
1,
~
f
t
sin (2( T
-
1)) dT =
~(~ -
e-
-
kt
)
! cos 2t)
=
L
k
= ~
sinh kl
sin 2 t
t(cosh 3t - 1)
0.2 sin 3t
sin t + sin 3t if t < 7T, ~ sin 3t if t
-~ cos (2t
- 2)
+ ~ if t >
>
7T
1
1
25. Y = 2e- 2t - e- 4t
+
(e- 2t + 2 - e- 4t + 4)u(t t
27. Y - 1 * y = 1, y = e
31. Y(1 + l/s2) = lis, y = cos t
L)
+
(e- 2t + 4 - e- 4t +S )1I(t - 2)
29. y - y * sin t = cos t, Y = lis, y = 1
33. Y(1 + 2/(s - 1) = (s - 1)-2, Y = sinh t
Problem Set 6.6, page 257
4
1.
2
(s - 1)
+4
4s + 5)2
2w(3s 2 - w 2)
9. (S2 + W2)3
2s
5. (S2
+
3. (S2
2ws
+ w2)2
24s2 + 128
7. (S2 - 16y3
2s cos k
11.
+
(s
2
(S2 - I) sin k
+
2
1)
A18
App. 2
Answers to Odd-Numbered Problems
15. te- 2t sin t
19. In s - In (5 - 1); (e t
13.6te- t
2 kt
17. t e
-
l)/t
Problem Set 6.7, page 262
1. .'"1 = -e- t sin t, )"2 = e- t cos t
3 ..'·1 = 2e- 4t - 4e 2t , )"2 = e- 4t - 8e 2t
5'.'"1 = 2e- t + 4e- 2t +!t -~, )"2 = -3e- t - 4e- 2t - ! t + ~
7. )"1 = e- t (2 cos 2t + 6 sin 2t) + t 2• .'"2 = lOe- t sin '2t - t 2
9• .'"1 = 4 cos 5t + 6 sin 5t - 2 cos t - 25 sin t. .'"2 = 2 cos 5t - 10 sin 5t + 20 sin t
11'.'"1 = -cos t + sin t + I + 1I(t - 1)1-1 + cos (t - I) - sin (t - 1)1
)"2 = cos t + sin t -I + 1I(t - l)ll - cos (t - 1) - sin (t - 1)]
13. Y1 = 2u(t - 2)(e 4t - e t + 6 ), Y2 = e 2t + ll(t - 2)(e 4t - 3e2t + 4 + 2e t + 6)
15. Yl = _e- 2t + et + ~llU - I)(-e- 2t + 3 + et ), Y2 = _e- 2t + 4e t
+ ~lllt - l)l-e- 2t + 3 + et )
17')"1 = 3 sin 2t + 8e- 3t , )'2 = -3 sin 21 + 5e- 3t
19')'1 = e t - e- t , Y2 = et , Y3 = e- t
25.4i 1 + 8(il - i2) + 2i~ = 390 cos t, 8i2 + 8(i2 - i 1 ) + 4i~ = 0, il = - 26e- 2t
- 16e- 8t + 42 cos t + 15 sin t, i2 = -26e- 2t + 8e- 8t + 18 cos t + 12 sin t
Chapter 6 Review Questions and Problems, page 267
11.
17.
13.
(s - 3)2
S
(s - 1)(5 2
23. 10 cos
+ 4)
tV2
29. te- 2t sin t
19.
2
S(S2
+ 4)
15. e-m ( 7T
5
25. 3e- 2t sin 4t
31. (t2 - l)u(t - 1)
21.
s12 )
a-b
2S2
-4-5
- I
+
(5 - (1)(5 -
b)
27. lI(t - 2)(5 + 4(t - 2»
7T
33. ---:3 (wt - sin wt)
w
35.20 sin t + li(1 - 1)[1 - cos (t - I)]
37. 10 cos 21 - ~ sin 2t + 4u(t - 5) sin (2t - 10)
39. e- t (7 cos 3t + 2 sin 3t)
t
41. e- + u(t - 7T)[1.2 cos t - 3.6 sin t + 2e-t+7T - 0.8e2t - 2"]
43. u(t - l)(t - l)e 2t - 2 + 4u(t - 2)(2 - t)e 2t - 4
45. Y1 = e t + ~e-t - ~ cos t - ~ sin t, )'2 = -et + ~e-t + ~ cos t + ~ sin t
47. Y1 = ~e-t sin 2t, )"2 = e-t(cos 2t - ~ sin 2t)
49. )'1 = e 2t , )'2 = e 2t + et
51. I = (l - e- 2S )/[s(s + 10)], i = 0.1 (I - e- lOt ) + O.lu(t - 2)[-1 + e- lOt + 20 ]
53. I = e- 2t(76 cos 4t - 42 sin 4t) - 76 cos 20t + 16 sin 201
55. i~ + IOUI - i2) = 100 t 2• 30i~ + 1O(i~ - i~) + 100i2 = 0,
;1 = (~ + 4t)e- 5t + 10t2 -~, i2 = (~ + 2t)e- 5t + 2t - ~
Problem Set 7.1, page 277
App. 2
A19
Answers to Odd-Numbered Problems
-:l[-: -1]
[-0
3. Undef.,
~
6
o
9
5.
[-48 -2] [
38
-44
67
-15
.
-33
9.
24
72
60
smne,
. same.
-5.0
-4.9
1.8
-3.6
3.5
-48
-34]
3.8
0.4
[_~::]
-2.2
2X2 =
4
=
0
4X2
24
[-03
= -3
-5X2
-3Xl
48]
0
3(,
-4.2
+
+
-5Xl
-9
-12
7. [ 6~], [ :.2]'
, undef.
Problem Set 7.2, page 286
64
[ 230
-92
-92
38
-12
6
5. [20
-3
-114
-114
126
-1:]
-7], [-62
34
2],
[50S]
525 ,same
790
5
7.
r
:
0
~]
[-310
8 ,31,
-62
0
16
0
170
34
-124
337
8
9. [ 252
49
-100]
-308
52
233
-68
, same,
10]; ,same
68
257
68
232
97
-248
-16
[
-188J
-96
265
-72] ,
110
A20
App. 2
11. [
13.
19.
21.
23.
27.
Answers to Odd-Numbered Problems
324
32
2M
38
-244
-10
216
-104
280
-132
-280
140
-320]
[
-322.
366
4324
1520
[ 3636
-4816]
1242
-451!S
-3700
-1046
5002
7060
960
-5120]
7548
1246
-5434
-8140
-1090
6150
1M] [
-68,
76
83, 166, 593, 0
(d) AB = (AB)T = BTAT = BA; etc. (e) AilS. If AB = -BA.
Triangular are U 1 + U 2 , U 1 U2 , U 12 , L1 + L 2 , L 1 L 2 , L12.
[0.8 1.2]T, [0.76 1.24]T, [0.752 1.248]T
P = [110 45 801 T, v = [92000 863001 T
Problem Set 7.3, page 295
= 2.5, Y = -4.2
x = 0, y = - 2, z = 9
1. x
3. x = 0.2, y = 1.6
7. x = 4, Y = 0, z = -2
x = 3y + 2, y arb.,.: = -y + 6
11. Y = 2x + 3.: + I, x, .: arb.
w = I, Y = 2.: - x, x, .: arb.
15. w = 3, x = 0, y = -2, ;;: = 8
h = (R 1 + R2)Eo/(R 1R 2), 12 = Eo/R1' 13 = Eo/R2 [Amps]
11 - 12 - 13 = O. (3 + 2 + 5)1t + 10/2 = 95 + 35, 10/2 - 5/3 = 35, 11 = 8,
12 = 5. 13 = 3 Amps
21. Xl + X4 = 500, Xl + X2 = 800. X2 + X3 = 1100, X3 + X4 = 800, Xl = 500 - X4,
X2 = 300 + X4, X3 = 800 - X4, X4 arbitrary
5.
9.
13.
17.
19.
Problem Set 7.4, page 301
1. L [I
3.3, [I
[0 0
5. 2, [3
7.2, r8
9. 3, II
11. 4, [I
13. No
19. Yes
29. No
35. I, [5
-2]; [1
4
0
-3]T
7], [0 -2
0
I
3], rO
0
5
105]; r -2
-I-
5]T, [0
I
5]T,
I]T
0
0
0
0
5],
4],
3
0
[0 3
[0 2
0], [0
0], [0
41; [3 0 5]T, [0 3 4]T
OJ; [8 0 4 O]T, [0 2 0 41T
5 8 -371, [0 0 -74 296]; same transposed
o 0], [0 0 1 0], [0 0 0 I]; same transposed
15. No
17. Yes
21. (c) I
27.2, [I -I OJ, lO 0 1]
31. I, [-i l 1]
33. No
Problem Set 7.7, page 314
5. 107
9. -66.88
13. 113 + v 3 + w 3 - 311VW
19. x = -1.2, Y = 0.8, .: = 3.1
23.3
7. cos (a + f3)
11.0
15.4
21. 1
App. 2
Answers to Odd-Numbered Problems
A2l
Problem Set 7.8, page 322
1. [
1.80
3.
- 2.32J
-0.25
0.60
[COS '28
-sin 28J
sin 28
9. A-I
15. (A'j-L (A-'j' ~ [I: -~
cos 28
=A
11. No
inver~e
=]
19. AA -1 = I, (AA -1)-1 = (A -l)-lA -1 = I. Multiply by A from the right.
21. det A = - I. C 12 = C21 = C33 = - I, the other Cjk are zero.
23. uet A = 1. Cl l = I, C 12 = -2, C22 = 1, C 13 = 3. C23 = -4, C33 = I
Problem Set 7.9, page 329
1. Yes, 2, [3
5. Yes, 2, [0
5
0
-51 T
1 OIT, [0 0
OIT, [2
0
11. Yes, 2, xe-:r , e- x
13. [I O]T, [0 I]T; [I I]T, [-1
15. Xl = -0.6Yl + 0A.Y2
X2 = -0.8y! + 0.2.'"2
19. Xl = 5-"1 + 3.'"2 - 3)'3
X2 =
o
0
[_~ ~J
7. Yes, 1,
X3
3. No
0
3.'"1
+
= 2.'"1 -
2.1'2 -
)"2
+
l]T
9. No
I]T; [I
O]T, [0
17.
Xl =
x2
=
-l]T
2.1"1
5\"
. 1
+ Y2
+ 3\".2
2)"3
2.'"3
21. Vs6
25. 2
23. 16Vs
29. 4V1 -
3V2
= 0,
V
= ±
[53
15]T
Chapter 7 Review Questions and Problems, page 330
11. X = 4, Y = 7
15. x = ~. Y = -~. z = ~
19. x = 22. Y = 4, 2 arbitrary
23.638.0,0
13. x = v + 6, z = y, )" arbitrary
17.x=7.)"= -3
21. 0
25.
8.0
-3.6
1.2
-3.6
2.6
2.4
1.2
2.4
9.0
[
12
27. 14, 14.
2;
[
o
o
o
r
J
20
29. [-20
9
-3],
_:
]
A22
App. 2
Answers to Odd-Numbered Problems
31.2,2
33.2.2
35.2,2
37.
39.
4~ [~~ 1~
23
_ 5]
42
41.
-10
5\ [
4
-5
,~ r
:]
72
-72
31
-3:2
-19
20
132]
59
-35
43. It = 33 A, 12 = 11 A, 13 = 22 A
45. II = 12 A, 12 = 18 A, 13 = 6 A
Problem Set 8.1, page 338
1. -2, [1 O]T; 0.4, [0 I]T
3.4,2\"1 + (-4 - 4)X2 = 0, say, Xl = 4. X2 = 1; -4, [0 I]T
5. -4, [2 9]T; 3, [1 I]T 7.0.8 + 0.6i, [1 _i]T; 0.8 - 0.6i, [l i]T
9.5, [1 2]T; 0, [-2 I]T 11.4, [I 0 O]T; 0, [0 1 OlT; -I, [0 0 I]T
13. -(A3 - I8A2 + 99A - I62)/(A - 3) = -(A2 - I5A + 54); 3. [2 -2 I]T;
6, [l 2 2]T; 9, [2 I _2]T
15. I, [-3 2 lO]T; 4, [0 1 2]T; 2, [0 0 I]T
17. -(A3 - 7A2 - 5A + 75)/(A + 3) = -(A2 - lOA + 25); -3, [I 2 _I]T;
5, [3 0 l]T, [-2 I O]T
19. -(A - 9)3; 9, [2 -2 I]T; defect 2
21. MA 3 - 8A2 - 16A + I28)/(A - 4) = A(A 2 - 4A - 32); 4. [-1 3 1 l]T;
-4,[1 1 -I -l]T;O.[I I L 1]T;8, [1 -3 1 -3]T
23.2, [8 8 -16 l]T; 1, [0 7 0 4]T; 3, [0 0 9 2]T, -6, [0 0 0 l]T
25. (A + IfcA2 + 2A - 15); -I, [I 0 0 O]T, [0 1 0 O]T;
-5, [- 3 -3 1 l]T, 3, [3 -3 I -I]T
29. Use that real entries imply real coefficients of the characteristic polynomial.
Problem Set 8.2, page 343
1. [ -
~
~] ; -1, [~] ; I, [~] ; any point (x, 0) on the x-axis is mapped onto
(-x, 0), ~u that [l
3.
(x, y)
O]T is an eigenvector corresponding to A = -1.
maps onto (x, 0).
[~
~] ; 1, [~] ; 0, [~] . A point on the x-axis maps
onto itself, a point on the y-axis maps onto the OI;gin.
5. (x. y) maps onto (5x, 5)'). 2 X 2 diagonal matrix with entries 5.
7. -2, [I -l]T, -45°; 8, [1 1]T,45°
9.2, [3 -l]T, -18.4°; 7, [I 3]T.71.6°
11. I. [-l/V6 1],112.2°; 8, [1 1IV6]. 22.2°
13. 1. [l I]T, 45°; -5, [I _l]T, -45°
15. c[l5 24 50l T , c > 0
17. x = (I - A)-l y = [0.73 0.59 1.04]T (rounded)
19. [I I 1JT
21. 1.8
23. 2.1
App. 2
A23
Answers to Odd-Numbered Problems
Problem Set 8.3, page 348
5. A-I = (_AT)-1 = _(A- 1 )T
3. No
7. No since det A = det (AT) = det (-A) = (-1)3 det A = -det A = O.
11. Neither, 2, 2, defect 1
9. Orthogonal, 0.96 ± 0.28i
Symmetric,
18,
15.0rthugunal, 1, i, -i
Symmetric, 1I + 2b, a - b, 1I - b
13.
9, 1R
17.
Problem Set 8.4, page 355
2]T. [2
1.[1
- I]T, [1
3. [1
[~
-1]T:X=
2J
-1
D=
[7
0
:J
I]T, D = [:
:J
5. [2
7. [1
9. [0
_l]T, [2 1]T, diag (-2, 4)
0 O]T, [1 -2 1]T, [0 1 O]T, diag (I, 2, 3)
3 2]T, [5 3 O]T, [1 0 2]T, diag (45. 9, -27)
13. [~: -~:J: -5. [~J :2. [~J : = [-~J . [-~J
-72J.
[-3J
x
-30
15. [
4~
17'[-:: ~:
66
19. C = [
1
12
3
21. C =
[ -4
100
=
27. C
=
[
-3
12J ' IOY1 2
-6
-4J
, 5Y1 2
[ 16
~];
-2
15yl = 5, x = [0.8
0.6
-
5.\"22
-
= 0,
X
=
- 3
+ 5"_22 = 10,
\' 2
'.1
1 -6J , 7y
. 12
-6
1
12
-4
6
1:] 1~]
23.C~ [~ V3]
2
25. C
. [-3J
3
-1::];4,[_:]:_2'[_:]:1'[
-12
[=:]
F
[9J.
.2
. . O. [12J.. x =
-4
-5
32
5y
.22
-
16J ' 28Y1 2
12
-
x
0.6J y, hyperbola
-0.8
[2tV5
-ltv's
[
=
= 35 , x = [
112
- v 3/2
~r:;
1IV2
-l/V2
4yl = 112, x =
ltv's]
2tv's
y, straight lines
V312] y, ellipse
112
lIV2J
lIV2
N2J
[lIV2 1
ItV2 - lIV2
y, hyperbola
y, hyperbola
A24
App. 2
Answers to Odd-Numbered Problems
Problem Set 8.5, page 361
3. (ABC)T = CTBTj\T = C- 1(-B)A
5. Hermitian, 3 + v'2. [-i I - v'2]T; 3 - 0. [-; I +
7. Hermitian. unitary, 1, [1 ; - iV2]T; -1, [1 i + ;"\,'2]T
9. Skew-Hermitian. 5;. [I 0 O]T, [0 I I]T; -5i, [0 I -1]T
11. Skew-Hermitian, unitary, ;, [I 0 IJ T. [0 1 01 T: -i. [l 0 _lIT
13. Skew-Hennitian. -66i
15. Hermitian, 10
v2T
Chapter 8 Review Questions and Problems, page 362
[ 3 2J [3 2J = [5 OJ
11. [-1/3
2/3J A
[1
9.
-4
_3
2/3
oJ!lo ~
-9
A
-1/3
[
4.8
19. I.lY1 2 +
- 3
0
=
A
-9
-7
~J [~ -~J
2
;l rl-~ -:l
-1.0
15.
-4
-2
2
4
=
- I
=
0
0
17r~
OJ,s. -I
5.0
yl
rlo~ -: ~l
I. ellipse
Problem Set 9.1, page 370
1.
5.
9.
13.
17.
23.
27.
31.
2, -4,0; V20; [ltv's, -2tv's, 0]
-8. -6.0; 10; [-0.8, -0.6,0]
~, £); V37/8
[4, -2,0], [-2, 1,0], [-I,~, 0]
[2!S, -14, -14]
(5.5, 5.5, 0), (t, ~, IS'»
l-8, -2,4]; V84
[-9. 0, 0]. [0, -2, 0]. [0, 0, -11]. Yes.
ci,
35. [
~ , ~]
- [-
~ , ~]
3.
7.
11.
15.
19.
25.
29.
33.
= [
-1,0,5; V26: [-I/V26, 0, 5/V26]
(7. 5, 0); ViO
(0, 1, ~); V37/2
[10, -5, -15J
[-2, 1,8], [6, -3, -24]
[0,0,91; 9
v = [0. O. -9]
Ip + q + ul ~ 6. Nothing
~ , ~]
37. Iwl/(2 sin a)
Problem Set 9.2, page 376
1. 4
3. -v24T
5. [12. -8.4], [-18. -9. -36]
9. -4.4
11. -24
17·la
19.0
2
7. 17
IS. Use (I) and Icos
+ bl + la - bl 2 = aoa + 2aob + bob + (aoa - 2a°b + bob) =
21. 15
yl ~ 1.
2/a/2 + 2/b/ 2
23. Orthogonality. Yes
App. 2
A25
Answers to Odd-Numbered Problems
27. 79.11 °
25.2,2,0, -2
33.54.79°,79.11°,46.10°
31. 54.74°
39. 1.4
37.3
41. If lal = Ibl or if a and bare OIthogonal
29. 82.45°
35.63.43°.116.57°
Problem Set 9.3, page 383
1. [0,0, -I OJ, [0, o. 10J
3. [-4, -8, 26J
5. [0. 0, - 60J
7. -20, -20
9.240
11. [19, -21. 24], V1378
13. [10, -5, -I]
15.2
17.30, -30
19. -20. -20
25. [-2. 2. 0] x [4.4. OJ = -16k. 16
27. [I, -1.2] x [1,2,3] = [-7, -1, 3J, v'59
29. [0, 10, 0] x [4, 3, 0] = [0, 0, -40], speed 40
31.1[7. O. OJ x [I. 1.0]1 = 7
33. ~V3
35. [18,14,26]; 9x + 7.v + 13z = c, 9·4 + 7·8 + 13·0 = 92 = c
37. 16
39. c = 2.5
Problem Set 9.4, page 389
5. Circles
1. Hyperholas
3. Hyperbolas
7. Ellipses; 288, 100, 409; elliptic ring between the ellipses
and
9. Ellipsoids
11. Cones
23. [8x. 0, .\''::], [0,0, x.::], LO, 18;:, xy]; [0, ;:, yl, [z, O. x],
13. Planes
b', x,
0]
Problem Set 9.5, page 398
1.
5.
9.
13.
17.
23.
25.
[4 + 3 cos t, 6 + 3 sin t]
3. L2 - t. 0, 4 + t]
[3, -2 + 3 cos t. 3 sin f]
7. [a + 3t, b - 21, c + 5t1
11. Helix on (x - 2)2 + (y - 6)2 = r2
[v'2 cos t, sin t. sin t]
Circle (x - 2)2 + (y + 2)2 = 1, z = 5 IS. X4 + .l = 1
Hyperbola xy = 1
r' = [-5 sin 1. 5 cos t. 0], U = [-sin t. cos t. 0]. q = [4 - 3w. 3 + 4w. 0]
r' = [sinh t. cosh f], U = (cosh 21)-112 [sinh t, cosh t]. q = Li + 4H'. ~ + 5wJ
v;:r:;r
= cosh t. l = sinh I = 1.175
Stan from ret) = [t, f(t)].
v = r' = ll. 2t. O].lvl = VI + 4t 2 • a = [0. 2. 0]
v(O) = 2wRi, a(O) = -w2 Rj
I year = 365' 86400 sec, R = 30' 365' 86400121T = 151· 106 [km]. lal = w 2R
= Iv/ 2 JR = 5.98· 10-6 [kmlsec2 ]
39. R = 3960 + 80 mi = 2.133' 107 ft. g = lal = w 2 R = IvI 2 /R, Ivl =
=
V6.61· 108 = 25700 [ft/sec] = 17500 [mph]
43. ret) = [t, yet), 0], r' = [1, y', 0], r' • r' = I + y'2, r" = [0, y", 0], etc.
47. 3/(1 + 9t2 + 9t4 )
27.
29.
33.
35.
37.
ViR
A26
App. 2
Answers to Odd-Numbered Problems
Problem Set 9.6, page 403
1. w'
3. w'
= 2V2(sinh 4t)/(cosh 4t)1I2
= (cosh
t)sinh
t-\(cosh2 t) In (cosh t) + sinh 2 t)
7. e 4u sin 2 2v, ~e4u sin 4v
5. w' = 3(2t4 + t 8 f(8t 3 + 8t7 )
9. -2(u 2 + V 2)-3U, -2(u 2 + V 2 )-3V
Problem Set 9.7, page 409
1. [2x, 2y1
7. [6. 4. 4]
13. [-4, 2]
19. [6. 4]
23. [-0.0015. O.
29. [8, 6. 0]
35.7/3
3. [1Iy, - X/y2]
9. [-1.25. 01
15. [-18, 24]
21. [-6, -12]
-0.0020]
31. [108, 108, 108]
37.2e 2 /V13
41. X4
+ y3
5. Lv + 2. x - 2]
11. [0. -e]
17. [48, -36]
27. [a, h.
33.V2t3
c]
3~2
-
Problem Set 9.8, page 413
1. 3(x
+ y)2
3. 2(x
+ x~ +
5. (y
z)
+x +
1) cos xy
7.9x2y2.::2
9. [Vt, V2, V31 = r' = [x', y', ~'] = b', o. 0].::.' = 0, z = C3, y' = 0, y = C2,
x' = Y = C2, X = C2t + Cl' Hence as t increases from 0 to 1, this "shear flow"
transforms the cube into a parallelepiped of volume l.
11. div (w x r) = 0 because Vb V2, V3 do not depend on x, y, z, respectively.
13. (b) (fVl)x + (fV2)y + (fv 3 )z = ![lVI)x
(c) Use (b) with v = Vg.
15. 4(x + )')/(y - x)3
17.0
+
(V2)y
+ (v 3 )z] +
fxv]
+
fyV2
19. e XY\v 2z2
+ !zV3, etc.
+ X 2.: 2 + x2y2)
Problem Set 9.9, page 416
1. [0. 0, 4x - I]
3. [0, O. 2e x sin y1
5. [0, 0, -4y/(x 2 + )'2)]
9. curl v = [- 2~. 0, 01, incompressible, v = r' = [x'. y'. z'] = [0. .:2, 0].
x = CI, Z = C3, y' = Z2 = C32, Y = c32t + C2
11. curl v = [0, 0, -2], incompressible. x' = y, y' = -x, z' = 0, Z = C3,
)' dy + x dx = 0, x 2 + )'2 = C
13. Irrotational, div v = 1, compressible, r = ['let, C2e-t. C3etJ
17.0,0, [xy - .:x, yz - xy, ,:x' - yz]
19. 0, 0, 0, - 2.\'Z2 - 2.:x 2 - 2.\"y2
Chapter 9 Review Questions and Problems, page 416
11. [-1, 9, 24]
15. [0, 0, -740], [0, 0,
19. -495, -495
23. If u x v = 0. Always
27.3.4
-740]
13. 0, [-43, 54, 3], [43, -54,
17. [-24, 3, -398], [114, 95,
21.90°, 95.4°
25. [VI, V2, -3]
29. If 'Y > ~7T, ~7T
-3]
-76]
App. 2
A27
Answers to Odd-Numbered Problems
33.45/6
37. 0, 2y2
41. 0, 2X2
+ (z + xl
+ 4y2 + 2z 2 +
35. No
39. l-l, 1, - l], [-2z,
43.4881"\13323 45.0
4xz
-2x, -2y]
Problem Set 10.1, page 425
1.
5.
7.
9.
11.
15.
F(r(t»
F(r(t»
F(r(t»
F(r(t»
F(r(t»
17/3
= 1125t6 , t 3 , 0], 1644817 = 2350
3.0 + 160
= [cosh t sinh 2 t. cosh 2 t sinh tl, 93.09
cos t, sin t], 67T
lcosh ~t, sinh !t, ett8 ], 0.6857
= let, et2 , e t2 ], e 2 + 2e 4 - 3
17. [367T.
= It,
=
~(87Tl,
367T]
Problem Set 10.2, page 432
3• _12 e -cx2+y2) , 0
11. sinh ae
1. sin xy, 1
7. x 2 )' + cosh z, 392
15. cea - aeb
5. e Xz
13. No
19. No
17.~a2bc2
+ y, -2
Problem Set 10.3, page 438
f Ix 7. f ~(e3x
I + i)
1
x3
3.
-
(x 2 - x 5 )] dx
o
4
- e- X ) dx
5. ~ cosh 6 - cosh 3
= l2
= !e 12 + !e-4 -
o
1
(2x 2
11.
dx
-1
15.
=
i
19. Ix
=
(a
~
+ b)h 124, Iy =
~
(e Sin y cos y - cos y) dy
9.
0
1
0
x = ib, Y = ~h
3
f
13. f f
+~
x
0
= e- 2
~ d)' dx = ~
17. Ix = bh 3 /l2. Iy = b 3 h/4
h(a
4
-
4
b )/(48(a - b»
Problem Set 10.4, page 444
1. 2x 3 ), - 2.\}"3, 81 - 36 = 45
5. e'C- Y - e X + Y, -~e3 + !e 2 + e- 1 - !
9. 0 (why?)
13. Y from 0 to ~x. x from 0 to 2. Ans. cosh
15. Y from 1 to 5 - x 2 • Ans. 56
3. 3x 2 + 3y2, 18757T/2 = 2945
7. 2x - 2v, -56/15
11. Integrand 4. Ans. 407T
2 - ! sinh 2
19. 4e 4 - 4
Problem Set 10.5, page 448
1. Straight lines, k
3. x 2/a 2 + y2/b 2 = 1, ellipses. straight lines, [-b CllS v, a sin v, 0]
5. z = (cla)Y + y2, circles. straight lines, [-aeu cos v, -aeu sm v, a 2u]
7. x 2/9 + ."2116 = z, ellipses, parabolas. [-811 2 cos V, -6112 sin v, 12u]
2
9. x /4 + )'2/9 + z2/]6 = I, ellipses, [12 cos 2 v cos u, 8 cos 2 v sin u, 6 cos v sin v]
13. [lOu, IOv, 1.6 - 4u + 2v], [40, -20, LOO]
15. [-2 + cos v cos u, cos v sin £I, 2 + sin v],
2
[cos V cos u, cos 2 v sin u, cus v sin v]
r
A2B
App. 2
Answers to Odd-Numbered Problems
17. [Lt, v, 3v 2 ], [0, -6v, II
19. [u cos V, 311 sin v, 3111, [-911 cos v,
-3u sin v. 311]
21. Because r1< and rl" are tangent to the coordinate curves v =
respectively.
23. [iT. v, li 2 + v 2 ]. N = [-2u, -2v, 1]
COllsT
and
1I
=
COIlST,
Problem Set 10.6, page 456
1.-64
7.27r
15. 140V6/3
3. -18
9. ~a3
17. 1287TV213 = 189.6
19. 6 7T 2 (373/2 - 53/2 ) = 22.00
27. 7Th4/V2
29. 7Th + 27Th 3 /3
i
5. -1287T
11. 17h14
25. 27Th
Problem Set 10.7, page 463
1. 8a 3b 3c 3127
7. 2347T
13.7Th5 /lO
21. 0
3.6
9.2a 5 /3
17. 1087T
23.8
5. 42~7T
11. ha 4 rr12
19. 2167T
25. 3847T
Problem Set 10.8, page 468
1.
3.
5.
7.
Integrab ..j.. I . 1 (x = 1). ..j.· 1 . 1 (y = 1), -8' 1 . I (z = 1).0 (x = v =
2 (volume integral of 6y2), 2 (surface integral over x = 1). Others 0
Volume integral of 6)'2 - 6x2 is O. 2 (x = 1), -2 (v = 1), others O.
F = [x. y. z], div F = 3. In (2). Sec. 10.7. Fon = IFllnl cos cp
=
z
= 0)
Vx 2 + y2 + Z2 cos cp = r cos cp.
9. F = [x,
0,
0], div F
= 1, use (2*), Sec. 10.7, etc.
Problem Set 10.9, page 473
1.
3.
5.
7.
11.
15.
19.
[0,
16]0[0, -1, 1], :::':12
-ex. eY ] [-1. -1. 1], ±(e2 - 1)
S: [u, v, v 2 1, (curl F)oN = _4ve 2v2 , ±(4 - 4e 2 )
(curl F)on = 312, :::':3a 2 /2
9. The sides contribute a, 3a 212, -a, O.
curl F = [0, o. 6],247T
13. (curl F)on = 2x - 2y, 1/3
-7T/4
17. (curl F)oN = 7T(COS TTX + sin -n:v), 2
For = [-sin e, cos eJo[ -sin e, cos e) = I. 27T, 0
8z,
r-ez ,
0
Chapter 10 Review Questions and Problems, page 473
11. Exact. -542/3
17. By Stokes, ± 18rr
23. 0, 4a137T
29. By Gauss, 1007T
35. Direct, 5(e 2 - 1)
13. Not exact, e 4 - 7
19. By Stokes, :::': 127T
25. 817. 118/49
31. By Gauss, 40abc
15. By Green, 1152rr
21. 4/5, 8/15
27. Direct. 5
33. Direct, rrh
App. 2
A29
Answers to Odd-Numbered Problems
Problem Set 11.1, page 485
3.
13.
15
k, k,
2mn, 27T11l.
±+ ! (COS +
x -
7T
'2
9
7T
I
sin
- sin 2x
2
y -
19. - -4 ( cos x
7T
2 (sin x
I
21. "3
7T
I
23. 6
7T 2 -
t'
+ ..!..
2 (cos x
4
+
cos 5x -
9
7T
7T
+
+
+
cos 3x
+ ... )
+ 4 (cos x + ..!.. cos 3x + _1_ cos 5x + ... )
17. -
29.
kin
kill,
2
+
4
+ - sin 3x - + ...
3
+
+
sin 3x
+ -I
cos 5x
25
+
+ ... )
+ ... )
sin 5x
"4I cos LX + 9"1 cos 3x - + ... )
I
cos x - - cos 2x
2
-
7T
=
25
I
cos 3x
9
+ _1_ cos 5x + ... )
cos 3x
+ -1
4 ( cos x -
-
25
2x, f" = 2,jl =
4
1
277T
8
+ - - cos 3x + - cos 4x - ...
O,j~ = -47T,j~ = 0, an =
_1_ (n7T
..!..)
cos
(-47T)
117T,
etc.
11
Problem Set 11.2, page 490
1. :
3.
..!.. 3
~x
(sin
~
2
7T
+
(cos
i
sin
3~X
..!..
TTX -
cos
4
+
2+4
(cos
9. "3
7T2
3
11. 4
4
7T
(
2
( co:>
TTX
2
+ ..!.. cos
9
27TX
+ -1
9
3
1
+ -
2
cus 2r
15. Translate by!.
+
37TX -
+ ... )
+ -I- cos 47TX + -1- cos
3·5
cos
1
2
+ - cos
TTX
37TX
5·7
+ -1
25
cos
1
9
1
-8 cos 4x
17. Setx
37TX
2
57TX
= O.
67TX
+ ... )
+ ... )
47Tt"
+ - ... )
I
57TX
cos - 25
2
+ - cos - - + -
+ .. -)
13. 8
+ ... )
I cos
"41 cos 27TX + 9"1 cos 37TX - 16
TTX -
cos -
TTX
5~X
sin
27TX
5. Rectifier, -2-4
- (1
- - cos
7T
7T
1·3
7. Rectifier, -1 - 24
2
7T
+
1
cos 37lX
18
+ -
A30
App. 2
Answers to Odd-Numbered Problems
Problem Set 11.3, page 496
1.
3.
5.
7.
9.
Even, odd, neither, even, neither, odd
Odd
Neither
Odd
Odd
. 2
13.
~7r
+
11 7r
~7r
~9 cos 3x +
+
(cos x
~9 sin 3x +
(Sin x -
4 (
71X
15. 1 - --:;; sin ""2
~
+ ... )
_1_ sin 5x 25
37rx
"31 sin -2-
+
+ ... )
_1_ cos 5x
+
51
+ ...
+
8 (
7T.X
19. (a) 1 + 7r2 cos ""2
+ 9 cos -2- + 25 cos -2- + ...
4 (
21. (a)
n"(
sin 2
"23 -
23. (a)
"2L -
2
7rX
L
2
371X
1
1
9
7r
+
9
~
)
4
I
57rx
1
37rx
cos
3m:-
L
1
+ 25 cos
77rx
- --:;1 cos -2-
+ - ...
I
57rx
sin -2- -
"9 sin 371X
571X
)
L
+ ...
- "21 sin L27rx + "31 sin L37rx - + ... )
+ -4 ( cos x + -I cos
(b) 2 ( sin x
3
1
+
)
571X
- "3 cos -2- + 5 cos 2
4L (
7rX
7r2 cos
+ ...
37rx
I
)
+ -1 sin - + - sin 27rx + ...
1
7H
1
37rx
""2 - "3 sin 71X + "3 sin -2- + 5
2L (
71X
(b) --;; sin L
25. (a) -7r
2
+ -1 sin
7rX
2 (
7r cos ""2
6 (
(b) 7r sin
1
+
5nr
51 sin -2-
)
4 (
71X
17. (a) 1. (b) 7r sin ""2
(b) 7r
37rx
"31 sin -2-
57T.X
sin -2-
sin 2x
+
3x
+
+ - I cos 5x + ... )
sin 3x
25
+ . . .)
Problem Set 11.4, page 499
3. Use (5).
(-1)n
00
L...
9 .1· "
n=-oo
- - einx
'
n
2i
7 . - - ~ _ _ _ e(2n+ 1)ix
7r n=-oo 2n + 1
2
1)n
11. ~ + 2 ~ ---- einx
3
n2
n=-(X:
00
n*O
co
13. 7r
+
i ~
n=-x
(
)
+ ... )
App. 2
A3l
Answers to Odd-Numbered Problems
Problem Set 11.5, page 501
3. (0.0511)2 in Dn changes to (0.02n)2. which gives C5 = 0.5100. leaving the other
coefficients almost unaffected.
5. Y = Cl cos wt + C2 sin wt + A{w) cos t, A(w) = 1/(w2 - L) < 0 if w2 < I (phase
shift!) and> 0 if w 2 > I
N
7. y =
Cl
cos wt
+ C2 sin
wt
+
an
L
n=l
9. )' = c cos wt
.+ .. )
11. Y =
Cl
cos wt
+ c sin
,
+
C2
1
-
2
3· 5(w
16)
-
wt
sin wl
cos4t-
w
2
- n
+ -7T- + -4
26"
~
2
cos nt
(
1
-----0;---
cos t +
'u'-I
oo
-
1· 3(w2
cos 2t
4)
-
cos 3t
6i'-9
1
+ 2w2
1/9
•
13. The situation is the same as in Fig. 53 in Sec. 2.8.
3c
15. y = -
64
+
9c
2
cos 3t -
8
64 + 9c
2
sin 3t
Problem Set 11.6, page 505
±
1. F = 2 ( sin x -
sin Ir
+ ... +
4 (
1
3. F = -7T
2 - -7T cos x + -9 cos 3x
{_l)N+l
N
)
sin Nx , E*
=
8.1,5.0,3.6,2.8,2.3
1 cos 5x + . .. ) E* = 00748 00748
+ -25
....,
0.01 19, 0.01 19. 0.0037
5. F = -2-4
- (1
- - cos 2x + -1- cos 4x
7T
IT
1·3
3·5
E* = 0.5951, 0.0292. 0.0292, 0.0066, 0.0066
7. F
=
-
4 ( sin x
+ -1
7T
0.6243. 0.4206
9. -8 ( sin x
7T
+ -1
27
0.0015, 0.00023
sin 3x + -1 sin 5.1'
3
5
(0.1272 when N = 20)
sin 3x
+
1~5
sin 5x
+ -1- cos 6x + . .. ) ,
5·7
+ . .. ) . E*
+ ..
J
= 1.1902, 1.1902, 0.6243,
E* = 0.0295, 0.0295, 0.0015,
A32
App. 2
Answers to Odd-Numbered Problems
Problem Set 11.7, page 512
LX
1. f(x) = 7Te- x (x > 0) gives A =
e- V cos wv dv =
, B = __t_v----=-2
+ w2
1 + tr
(see Example 3), etc.
0
2
X
3.I(x) = !7Te- gives A = 1/(1 + w ).
5. Use f = (7TI2) cos v and (11) in App. 3.1 to get A = (cos (ml'!2»/(1 - w 2 ).
2
7. 7T
2
11. -
Icc
sin
7T
cos XW
--dw
2
9 -
• 7T
W
0
LX cos TTW + I
l-w
7To
2
17. -
(ltv
LX
tv
w2
0
LX
7To
7TH' - sin 7TW
sin
2
0
2
15. -
cos xw dw
2
I"" -cos- -w-+- , ,sin- -w--- - cos
XIV
2
19. rr
dw
IV
HI'
sin mv
2 sin xw dw
l-w
L wa cc
0
sin wa
2
sin xw dw
IV
Problem Set 11.8, page 517
1.
7.
{2 ( sin 2w -
V--;;
2 sin w )
5.
w
v:;;n cos w if 0 <
11. V(2/7T) w/(w
2
+
7T
tv
2
< 7T12 and 0 if w > 7T12
v:;;n e-x
(x> 0)
9. Yes, no
)
19. In (5) for f(ax) set a.>.: = v.
Problem Set 11.9, page 528
3. ik(e- ibw
7. [(I
- l)/(v'2;w)
iw)e- iw - 1]/(v'2;w2 )
+
11 ·21 e -w
13. (e ibw
5. V(2/7T)k (sin w)/w
9. V(21'7T)i(cos w - 1)/w
2 /2
-
e- ibw )/(i}vv'2;) = v'2/;-(sin bw)/w
Chapter 11 Review Questions and Problems, page 532
11. 4k (sin TTX
+
7T
13. 4 ( sin
15.
8 (
7T2
"2x- I"2
sin
!3
sin 31TX
sin x
+
2nt - '9I sin
+
!5
sin 57TX
+ ... )
'31 sin 23x - 4"1 sin 4x + "51 sin 25x 3nt
-2- +
I
25
5nr
sin -2- -
17. -2-4
- (1
- - cos 16rrx + -1- cos 327TX
7T
7T
1·3
3·5
+ ...
)
)
+ -1- cos
5·7
+ ...
481TX
+ ...)
dw
App. 2
A33
Answers to Odd-Numbered Problems
19.
7r 2
12 -
I l L
cos 6x + 16 cos 8 \" -
"4 cos 4x - 9
+
cos 2x
23. 7r 2 /8 by Prob. 15
21. rr/4 by Prob. 11
25.
1
"2 [f(x)
+
+ ...
1
"2 Lt(x) -
f(-x)1,
fe-x)]
27. rr - -8 ( cos -x + -I cos -3x
7r
2
9
2
+ -1
25
cos -5x + ... )
2
29.8.105,4.963,3.567,2.781,2.279, 1.929. 1.673, 1.477
31. y
C] cos wI + C2 sin wt
=
L"" (cos
1
7r
3w
+ w sin w - 1) cos
W
LX
sin
tV -
0
7r
(cos I
2
w - 1
1
4
-
-
•
cos 2t
_ 4
w2
+
1
9
cos 3t
w2
-
cos
H'
2
W
+
(sin w - w cos w) sin
2
tVX
dH'
.
••
•
Sin tU d~~
W
LX sin 2w -
4
37. -
tVx
w
0
2
35. 7r
4
--2 -
+ _ ... )
__1_. cos 41
16 w 2 - 16
33. -
7?
+
2w cos 2w
w
0,
3
39.
cos wx dl\'
~.----::--+4
,,-;
I\,2
t' ,
Problem Set 12.1, page 537
1. 1I
5. 1I
=
Cl(X)
=
c(x)e- Y
9. 1I
=
Cl(X»)'
cos 4)' +
+
+
C2(X)
eXY/(x
+
3.
7.
sin 4)'
1)
C2(.1'»),-2
15. C = 1/4
19. 7r/4
27. 1I = 110 - (lIO/in 100) In
(X2
+ )'2)
+
1I = Cl(X)
1I = c(x)
exp
C2(X)y
(h 2 cosh x)
11. u = c(x)eY + h(y)
17. Any C
21. Any C and w
29. 1I = ClX + C2(Y)
Problem Set 12.3, page 546
1. k cos 2m sin 2m;
3. '8k
3 ( cos 7r1 sin 7rX + -
5.
7r
27
cos 37r1 sin 37rx
+
4
57r
( COS 7r1 sin 7rX - -1 cos 37rf sin 37rx
9
+
--2
2
7. 7r 2
+
( cv'2
-
~ Cv'2 +
1) cos m sin 7rX
+
1~5
cos 57rISin57r,+ ... )
_1_ cos 57rf sin 57rx 25
"21 cos 2m sin 27rx
1) cos 3m sin 31TX - .• -)
+ ...)
9
A34
App. 2
Answers to Odd-Numbered Problems
9. 22
~
( (2 - V2) cos m sin = - -I (2
_1_ (2
25
it =
3=
+
+ V2) cos 5m sin 5= + ... )
2
17.
+ v'2) cos 3m sin
9
8L
~3
(
cos
[(~)2J
C
L t
L7TX
sin
1 cos [(3~)2J
33
C
L t
+
sin
371:\ + ...)
L
19. (a) u(O, t) = 0, (b) lI(L, t) = 0, (c) 1l:t"(0. t) = 0, (d) uxCL, t) = O. C = -A,
D = -B from (a), (b). Insert this. The coefficient determinant resulting b'om (c),
(d) must be zero to have a nontrivial solution. This gives (22).
Problem Set 12.4, page 552
3.
11.
13.
15.
17.
19.
c 2 = 300/[0.9/(2·9.80)] = 80.83 2 [m2 /sec2 ]
Hyperbolic. u = fl(X) + f2(X + y)
Elliptic, U = fl(Y + 3ix) + f2(Y - 3ix)
Parabolic, 1I = Xfl(X - Y) + f2(x - y)
Parabolic. 1I = Xfl(2x + y) + f2(2x + y)
Hyperbolic, u = (l/y)fl(XY) + f2(Y)
Problem Set 12.5, page 560
5.
It
= sin OAnt e-1.752.167T2t/lOO
7.
II
=
9. u =
11.
II
=
! (~
sin O.lnt
2:~
(sin 0.17T.t e-O.017527T2t
til
+ Un. where
Un
=
II -
UI
~
+
e-O.017527T2t
+
sin 0.27TX e-O.01752(2m2t - . . . )
~
sin 0.3n\" e-O.01752(37T)2t - - ... )
satisfies the boundary conditions of the text. so
fL
oc
117T.t
2
2
that u II = L...
~ B sin - - e-lCn7TIL) t B = n'
L
. II
L
n~1
0
13. F = A cospx + B sinpx, F'(O)
p = n~/L, etc.
15. u = 1
17. II =
2;2 +
~2
19. u = 12
23 . - -K~
L
l1~l
[f(x)
(x)] sin -L
- dx
. - II I'
~
4 (cos x e- t -
= Bp = O. B = 0, F'(L) = -Ap sinpL = 0,
cos 2x e- 4t
+
i
I
I
4
9
cos 3x e- 9t -
+ ... )
+ cos 2,' e- 4t + - cos 4x e- 16t + - cos 6x e- 36t + ...
.
L
OC
nB II e- A " 2 t
25. w
= e-{3t
1I~1
27. v t - c 2v xx =
0,
w"
=
-Ne- ux/c 2, w
=
C~2
[_e- ux -
~
(J - e-uL)x
so that w(O) = n'(L) = O.
29.
1I
= (sin i~x sinh ~~y)/sinh
1T
80 ~
31. II = -
L...
~ n~l
(2n - 1)=
(211 - 1) sinh (211 -
l)~
Sll1
24
(211 - 1)7TV
sinh - - - - 24
+
1] '
App. 2
A35
Answers to Odd-Numbered Problems
33.
= Aox +
II
=
Ao
I
-2
24
L
sinh (I171:T/24)
117TY
An - - - - - cos -24 '
sinh 117T
f
fey) dy,
24
117TX
oc
35. ~ An sin - - sinh
f24
0
117T(b - y)
a
n~l
I
12
= -
An
0
, An
a
117TV
f( \") cos - - ' dv
24-
=
2
fa
a sinh (l17Tb/a)
0
117TX
f(x) sin - - dx
a
Problem Set 12.6, page 568
2 sin ap
1. A =
, B = 0,
fX -sin-ap- cos px e- c
2
2
= -
II
rr
~
L cos
2
p t
dp
p
0
oo
= e- P , B = 0, u =
3. A
px
dp
e-p-cZp2t
o
5. Set
= s. A = I if 0 <
TTV
p/7T
<
I, B
= 0, u
{T
=
cos px
e-c2p2t
dp
o
~
7. A
2[cos P
+ P sin p
1)/(~2)]. B
-
O.
=
f"
u =
A cos px e-c2p2t dp
o
Problem Set 12.8, page 578
1.
3.
5.
7.
11.
(a), (b) It is multiplied by v'2. (c) Half
= 16/(ml17T2) if 111. 11 odd. 0 otherwise
Bmn = (-l)n+18/(mn7T2) if 111 odd, 0 if m even
Bmn
=
Bmn
(_l)m+n4/(mn7T2)
k cos v'29 1 sin 2x sin 5y
6.4 "" ""
I
13. - 2 ~ ~ -----:33 cos (1
m~l n~l 117 II
111, n odd
7T
17.
V
117
2
+
2
/1 )
sin
l/IX
sin
11)"
(colTesponding eigenfunctions F 4,16 and F 1614 ), etc.
0 (m or 11 even). Bmn = 16k/(l1l1l7T2 ) (m, 11 odd)
C7TY'260
19. Bmn
=
21. Bmn = (-l)m+nI44a 3 b 3 /(m 3 11 3 1fl)
9
23. cos ( 7Tt
2
a
+ 216)
47TV
37TX
sin - - sin - - '
a
b
b
Problem Set 12.9, page 585
7. 30r cos 8 + 10,-3 cos 38
9.55
220
+ -
( r cos 8 - -1 ,.3 cos 38
3
7T
11.
7T _
2
~
7T
(r cos 8 + ~9
,.3
cos 38
15. Solve the problem in the disk,. <
and -lio on the lower semicircle.
u
=
4uo
7T
(!..
a
sin 8
17. Increase by a factor
+
~
3a
v'2
r3
a
+ -1
5
+
_1_
25
r 5 cos 58 ,.5
cos 58
:-.ubject to
sin 38
+
Uo
~
r
5a
19. T
5
+ ... )
+ ... )
(given) on the upper semicircle
sin 58
+ ...)
= 6.826pR2 f1 2
A36
App. 2
Answers to Odd-Numbered Problems
21. No
23. Differentiation brings in a factor lIAm = RI(eCi m ).
Problem Set 12.10, page 593
11. v
=
F(r)C(t), F"
+
k 2F
= 0, G +
e 2k 2C
2
R
C n = Bn exp (-e 21l 27T 2tIR2), Bn = -
= 0,
fR
Fn
= sin (I1'mfR),
WITT
rf(r) sin - - dr
R
0
13. u = 100
15. u = ~r3P3(CoS 1;) - ~rPl(cos 1;)
17. 64r4P4(COS 1;)
21. Analog of Example 1 in the text with 55 replaced by 50
23. v = r(cos ())lr2 = r/(x 2 + )"2), V = xyl(x 2 + )"2)2
Problem Set 12.11, page 596
e(s)
= -
X
S
+
x
, W(O, s) = 0, e(s) = 0, w(x, t) = x(t - 1 + e- t )
+ 1)
7. w = f(x)g(t), xI' g + fi = xt, take f(x) = x to get g = ce- t + t - 1 and
e = 1 from w(x, 0) = x(e - L) = o.
9. Set x 2/(4c 27) = Z2. Use z as a new variable of integration. Use erf(x) = I.
5. W
S2(S
Chapter 12 Review Questions and Problems, page 597
19. u
=
+
el(y)e X
c2(y)e- 2X
21. u = g(x)(1 - e- Y ) + f(x)
25. u = ~ cos 1 sin x - -! cos 3t sin 3x
23. u = cos t sin x - ~ cos 21 sin 2x
27. l/ = sin (0.02'i1:\") e-0.0045721
200
29 u = 7T 2
•
31. u
=
(7TX
sin 50
·
e-0.004572t -
37T.\"
-1 sin - e-0.04115t
9
50
100 cos 4x e- 16t
7T
16
33. u = - - 2
7T
(1-
4
cos 2x e- 4t
+ -
1
36
cos 6.1: e- 36t
+ ... )
+ -I- cos lOx e- lOOt
100
+ ... )
37. u = fl(Y) + f2(X + y) 39. II = fl(Y - 2ix) + f2(Y + 2ix)
41. l/ = Xfl(Y - x) + f2(Y - x)
49. II = (111 - 1l0l(1n r)lIn (rl/ro) + (110 In rl - Ul In ro)lIn (rl/rO)
Problem Set 13.1, page 606
5. x - iy = -(x + iy), x = 0
9. -5/169
11. -7/13 -(22/l3)i
15. -7/17 - (l1117)i
17.xl(x2 + y2)
7.484
13. - 273 + 136i
19. (x 2 - y2)/(x 2 + y2)2
Problem Set 13.2, page 611
1. 3V2(cos (--!7T) + i sin (-!7T))
3. 5 (cos 7T + i sin 7T) = 5 cos 7T
1
5• cos "27T
. 1
+ I.sm
"27T
App. 2
Answers to Odd-Numbered Problems
A37
9. -37r/4
7. ~v'6I (cos arctan ~ + i sin arctan ~)
15. 37r/4
11. arctan (±3/4)
13. ±7r/4
17.2.94020 + 0.59601i
19.0.54030 - 0.84147i
21. cos (-~7r) + i sin (-~7r), cos ~7r + i sin ~7r
23. ±O ±i)rV2
25. -I, cos!7r ::!: i sin !7r, cos ~7r ± i sin ~7r
27. 4 + 3i, 4 - 8i
29. ~ - i, 2 + ~i
35. 1<:1 + z21 2 = (:1 + Z2)(Zl + Z2) = (:1 + z'2)(':1 + ':2)' Multiply out and use
Re <:1':2 ~ IZ1z21 (Prob. 32):
2
:1':1 + :1':2 + :2':1 + ::2':2 = hl + 2 Re :1':2 + IZ212 ~ 1:112 + 2blk:21 + IZ212
= (1:11 + IZ21)2.
Take the square root to get (6).
Problem Set 13.3, page 617
1.
3.
5.
9.
13.
15.
19.
23.
Circle of radius ~, center 3 + 2i
Set obtained from an open disk of radius I by omitting its center z = 1
Hyperbola xy = 1
7 . .v-axis
The region above y = x
f = I - 11(::; + 1) = 1 - (x + 1 - iy)/l(x + 1)2 + y21; 0.9 - O.li
(x 2 - y2 - 2.ixr)/(x2 + y2)2, -i/2
17. Yes since r2(sin 2e1/r ~ 0
Yes
21. 6;::2(;::3 + i)
2i (1 - Z)-3
Problem Set 13.4, page 623
5. Yes
1. Yes
3. No
9.Yesforz*0
7. No
11. rx = x/r = cos e"y = sin e, ex = -(sin e)/r, ey = (cos e)/r,
(a) 0 = Ux - Vy = It,. cos e + tllI(-sin e)/r - Vr sin e - vo(cos e)/r.
(b) 0 = lty + Vx = 1I,. sin e + lle(COS e)/r + Vr cos e + v e ( -sin e)/r.
Multiply (a) by cos e, (b) by sin e, and add. Etc.
13. z2/2
15. In Izl + i Arg :
17. Z3
19. No
21. No
23. c = I, cos x sinh y
27. Use (4), (5), and (1).
Problem Set 13.5, page 626
3. -1.13120 + 2.47173i, e = 2.71828
5. -i, 1
9. e- 2x cos 2.y, _e- 2T sin 2.y
7. eO.s(cos 5 - i sin 5), 2.22554
2
2
11. exp (x - )"2) cos 2xy. exp (x - y2) sin 2.\)"
13. e i7r/4 , e 5m/4
15. Vr exp [ice + 2k7r)/Il], k = 0," ',11 - 1
17. ge m
19. z = In 2. + tri + 21l7ri (11 = 0, ±l,"')
21.: = In 5 - arctan ~i ± 21l7ri (11 = 0,1,"')
Problem Set 13.6, page 629
3. Use (II), then (5) for e iy , and simplify. 5. Use (II) and simplify.
7. cos 1 cosh 1 - i sin 1 sinh 1 = 0.83373 - 0.98890i
A38
App. 2
Answers to Odd-Numbered Problems
11. -3.7245 - 0.51182i
15. cosh 4 = 27.308
19. z = ~(21l + 1)7T - (-l)n1.4436i
9.74.203, 74.210
13. -1
17. z = :::':::(211 + 1)7Ti/2
21. ::: = :::':::lI7Ti
25. Insert the definitions on the left, multiply out, simplify.
Problem Set 13.7, page 633
1.
5.
7.
11.
15.
17.
19.
21.
25.
In 10 + 7Ti
3. ~ In 8 - ~7Ti
In 5 + (arctan ~ - 7T)i = 1.609 - 2.214i
0.9273i
9. ~ In 2 - ~7Ti
:::':::(211 + I )7Ti. 11 = 0, I. . . .
13. In 6 :::'::: (21Z + 117Ti, 11 = O. 1. ...
(7T - I :::'::: 2117T)i, Il = 0, 1, ...
In (;2) = (:::':::211 + i)7Ti. 21n i = :::':::(411 + l)'ITi. n = O. 1. ...
3
eO. (cos 0.7 + i sin 0.7) = 1.032 + 0.870i
2
e (l + i)/V2
23. 64(cos (In 4) + i sin (In 4»
2.8079 + 1.3179i
27. (I + ;)/,\0.
Chapter 13 Review Questions and Problems, page 634
19. -~ - ~i
25. 12e- 77i/2
31. fez) = liz
37. (-x 2 + )"2)/2
43.0.6435i
17. -32 - 24i
23. 6V2e37ri/4
29. (:::':::1 :::'::: i)/V2
z2
35. f(:::) = e
41.0
21.5 - 3i
27. :::':::(2 + 2i)
33. fez) = (l + i):::2
39. No
45. -1.5431
Problem Set 14.1, page 645
1.
3.
7.
9.
Straight segment from I + 3i to 4 + 12i
Circle of radius 3, center 4 + i
5. Semicircle. radius 1. center 0
Ellipse, half-axes 6 and 5
from - 1 - ~i to 2 + 4i
Parabola)' =
¥3
11. e- it (0 ~ t ~ 27T)
13. t + ilt (1 ~ t ~ 4)
17. -Q - ib + re- it (0 ~ t ~ 27T)
21. 0
25. il2
29.2 sinh ~
15. t + (4 - 4t2 )i (-1 ~ t ~ I)
19.~ + ~i
23. 7Ti + ~i sinh 27T
27. -I + i tanh ~7T = -I + 0.6558i
Problem Set 14.2, page 653
1. 7Ti. no
3.0, yes
7. O. yes
9.0, no
15. Yes, by the deformation principle
21. 7Ti
23.27Ti
27. (a) 0, (b) 7T
29.0
5.0, yes
11. O. yes
19. 7Ti by path deformation
25. 0
App. 2
A39
Answers to Odd-Numbered Problems
Problem Set 14.3, page 657
1.-4~
7.0
13. ~
17. ~i cosh2(l
5.8m
3.4~
+
11.m
9. -~i
15. 2m Ln 4 = 8.710i
i) = ~(-0.2828 + 1.6489i)
Problem Set 14.4, page 661
1. 27~i/4
7. 2~i if lal < 2.0 if lal
11. 2~2i
5. ma 3 /3
>2
9. m(cos i-sin ~)
13. ~iea/2124 if la - 2 - il < 3 0 if la - 2 - il > 3
3. -2~ierr/2
Chapter 14 Review Questions and Problems, page 662
17. -6~i
23. ~i sin 8
21.
27.
19.0
25.0
-~i
~
29.0
Problem Set 15.1, page 672
Bounded, divergent, ± J
3. Bounded, convergent, 0
Unbounded
Bounded. divergent, ±llV'2" ± i, O. I, -2
Convergent, 0
13. IZn - II < ~E, Iz~ - [*1 < ~E (n > N(E)), hence IZn + z~ - (l + [*)1 < ~E + ~E
17. Convergent
19. Divergent
21. Conditionally convergent
23. Divergent by Theorem 3
27.11= 1100 + 75il = 125 (why?); 1100 + 75iI 125/125! = 125125/[\h50~ (l251el 25J
= e125tV250~ = 6.91 . 1052
1.
5.
7.
9.
Problem Set 15.2, page 677
1. ~ a n z2n = ~ a n(z2)n, Iz21 < R = lim lanlan+ll, hence Izl <\'R.
3. -i, I
5. -1, e by (6) and (1 + I1n)n ~ e.
7.0, Iblal
9. O. 1
11.0, 1
15. i, 1IV2
17. 0, V2
13. 3 - 2i. 1
Problem Set 15.3, page 682
3. V2
9. ]
1. 3
7. 1/\/7
5.
V5i3
Problem Set 15.4, page 690
1. 1 - 2z + 2z 2 - ~Z3 + ~Z4 - + .. ', R =
3. e- 2i (1 + (z + 2i) + hz + 2i)2 + t(z + 2i)3
5. I - ~(z - ~~)2
7. ~
+ ~i + !i(z
+
- i)
l4(Z - !~)4 - 7~O(Z
+
(-~
+ ~O(Z
-
- ;)2 -
00
+
+ 2i)4 + ... , R = 00
!~)6 + - ... , R = oc
~(z - ;)3 + - ... , R = V2
l4(Z
A40
App. 2
Answers to Odd-Numbered Problems
+ t:: 4 - .i8::.6 + 3~4<.8 - + .. " R = x
11.4(:: - 1) + \O(z - 1)2 + 16(:: - 1)3 + 14(:: - \)4 + 6(::
13. (2/v;.)(:: - z3/3 + ::5(2!5) - z7/(3!7) + ... ), R = GC
15. ;:3/(l!3) - ::7/(3!7) + ;:1l/(5!1l) - + . ". R = x
19.:: + ~Z3 + 1~::5 + i{5:: 7 + . ", R = !7T
9. 1 - !::2
- 1)5
+
(z - 1)6
Problem Set 15.5, page 697
3. R = \[\1'; > 0.56
7. Itanhn Izll ~ I, 1/(/1 2 +
11. Izl ~ 2 - 8 (8 > 0)
15.lzl ~ \,'5 - 8 (8) 0)
1. Use Theorem 1.
5. Iz nl ~ 1 and ~ 1//1 2 converges.
9.
Iz +
<R=
L - 2il ~ r
4
13. Nowhere
\) <
\//1
2
Chapter 15 Review Questions and Problems, page 698
13. 1. ! Ln[(1 + z)/(1 - .:)]
17. 1/v;., [l - 7T(Z - 2i)2r 1
11. x, e 3 -
15.
GC
19. 1/3
21. -1 - (::. - 7Ti) - (:: - 7Ti)2/2! - .. "
23.! + ~(:: + I) + t(z +
R = x
1
1 6(Z
1)2 +
+ 1)3 + . ", R = 2
+ 3;: + 6;:2 + 10;:3 + .. '. R = I. Differentiate the geometric
+ (:: + i) - i(z + i)2 - (z + i)3 + .. " R = 1
25. \
27. i
1
+
29. -(:: - 2 7T)
3!\ (:: -
13
27T) -
1
5! (;: -
15
27T)
+ - .. '. R =
series.
ex;
Problem Set 16.1, page 707
\
1. .:4
1
1
1
+ -::3 + -Z2 + -z + 1 + z +
1111
+ - - Z2
2;::
6
3.
-3 -
5.
-_3
I
24 .(.
+ -
Z
\
1
+ -
::5
1
2::7
ez-1e
7. - - = e
Z -
9. -
\
(i
2:"2
x
n~O
1
6;:9
+ oc
+ -
(::: -
2:
n~O
I (.: -
12
120 ~
I)n-l
= e
i)n-l = -
R
= \
+ - ... R =
,
[1
-Z- \
+
1
z- \
+ - - +"'],R=X
2!
il2
\;
--=-=-=+ "4 + -8 (:: <.
i) -
I
R=2
x
11. -
2: (.: + ;)n-l =
-
n~O
3
13. - - z- 1
1
~
Z
+
2
+
(z -
\)
I
x
= ox
'
/1!
+ .. '.
- -
+ ... R
n
-l-l
Z2
- \ - (.: + ;) - .. " R =
1
1
16
(:: -
;)2 -
.. "
App. 2
A41
Answers to Odd-Numbered Problems
15.
L
-L
< I.
:x;
1:::1
;::4n+2,
< I,
> 1
~
l
x
-L
4n+2'
Izl >
1
'11=0 Z
'>1=0
19.
Izl
_3n+3'
n~O
17. L
1
Xl
~3n, 1::1
i
I
+ - - + i + (;:: ~ -
(::: - i)2
i
L ~4n, Izl <
x
21. (I - 4:::)
i)
(4 I) L I
1,
3
n=O
x
- 'I
Z
Z
4n'
n~O Z
Izl >
I
Problem Set 16.2, page 711
1. :::,:::~. :::':::~, ... (poles of 2nd order), :x (essential singularity)
3.0, :::,:::v:;.;.. :::,:::\12;. ... (simple poles), :x (essential singularity)
5. :x (essential singularity)
7. :::'::: I, :::':::i (fourth-order poles), :x (essential singularity)
9. :::':::i (essential singularities)
13. -16i (fourth order)
15. :::'::: 1, :::':::2, ... (third order)
17. :::':::irV3 (simple)
19. :::':::2i (simple), 0, ::':.2 m, :!::4'ITi, ... (second order)
21. O. :::':::271. :::':::471• ... (fourth order)
23. f(~) = (~ - ~o)ng(:::), g(zo)
"* 0, hence p(:~) =
(::: - ;::0)
2n
l(z).
Problem Set 16.3, page 717
1. i. 4i
5. 1/5! (at::: = 0)
(at ~ = I), ~ (at z = -1)
9.
15. el/z = 1 + 1/;:: + ... , AilS. 2m
19. -4'ITi sinh ~7I
-!
23.0
3. -~i (at ~ = 2i), ~i (at - 2i)
7. I (at :::':::ll'1T)
11. -1 (at z = :::':::~71", :::':::~71", ... )
17. Simple poles at :!::~. Am. -4i
21. -4i sinh ~
25. ~ (at ~ = ~), 2 (at ~ = ~). Ans. 5m
Problem Set 16.4, page 725
1. 271/\;TI
7.0
13.71"/16
19.0
25.0
5. 271/3
3.271/35
9. 1f
15.0
21. 0
27. -71"/2
11. 271/3
17.71"12
23.71"
Chapter 16 Review Questions and Problems, page 726
17. 271"i/3
23. m/4
27.6m
33.0
19. 571"
25.0 (11 even),
29. 271/7
35. 71/2
(_I)<n-1)/2
21. ~'IT cos 10
271"i/(/1 - I)! (/1 odd)
31. 4'IT/V3
A42
App. 2
Answers to Odd-Numbered Problems
Problem Set 17.1, page 733
3. Only the size
5. x = e, W = -y + ie, y = k. w = -k
7. -371"/4 < Arg w < 31714, Iwl < 1/8
11.lwl ~ 16, v ~ 0
15. In 2 ~ II ~ In 3, 71"/4 ~ v ~ 17/2
19.0, ±1, ±2, . . .
+
ix
9.lwl > 3
13. Annulus 3 <
17. ±1, ±i
21. -a/2
Iwl
< 5
23. (/ and 0, ~
25. M = eX = 1 when x =
27. M = 1IIzi = 1 on the unit circle, J = 111<:12
o. J =
e
2x
Problem Set 17.2, page 737
_ __
-
~.~.
-nv
-2w + 3
9. z = 0
13. z = ±i
17. (/ - d = 0, ble = 1 by (5)
5i
7.;::= - - 4w - 2
n. <: = ~ + i ± Vi + i
15. w = 4/z, etc.
19. w = add (a *- 0, d *-
0)
Problem Set 17.3, page 741
5. Apply the inverse g of f on both sides of Zl = f(zl) to get g(Zl)
7. w = (<: + 2i)/(z - 2i)
9. w = z - 4
11. w
=
15. w =
19. w =
liz
(z + 1)/(-3z + I)
4
(Z4 - i)/(-i: + 1)
=
13. w = (3iz + 1)lz
17. w = (2z - i)/(-iz - 2)
Problem Set 17.4, page 745
1. Annulus 1 ~ Iwl ~ e 2
3. liVe < Iwl < Ye, 371"/4 < arg w < 571"/4
5. 1 < Iwl < e. u > 0
7. w-plane without 0
9. u 2 /cosh 2 1 + v 2 /sinh 2 1 ~ 1, u ~ 0
11. Elliptic annulus bounded by u2/cosh2 I + v 2 /sinh 2 1 = I and
u 2/cosh 2 5 + v 2 /sinh 2 5 = \
13. ::!:(211 + 1)17/2, II = 0, \, ...
15.0 < [m t < 71" is the image of R under t = Z2. AilS. e t = e z2
17. O. ±i, ±2i, ...
19. 112/cosh 2 1 + v 2 /sinh 2 1 ~ \, v < 0
21. v < 0
23. -I ~ l/ ~ I. v = 0 (c = 0). u 2 /cosh2 e + u2 /sinh 2 e = I (e *- 0)
25. In 2 ~ l/ ~ In 3, 71"/4 ~ v ~ 7[/2
Problem Set 17.5, page 747
1. w moves once around the unit circle.
7. - i/2, 3 sheets
g(f(zl»
5. - 5/3, 2 sheets
9. 0, 2 sheets
=
Zl·
App. 2
Answers to Odd-Numbered Problems
A43
Chapter 17 Review Questions and Problems, page 747
11.
15.
17.
23.
27.
33.
39.
45.
1I = ~V2 I, ~V2 - I
The domain between II
Iw + ~I = ~
0, (:::'::: 1 :::'::: i)/V2
0, :::':::i/V2
tV = zi(z + 2)
I + i : :': : v'1+2i
iz 3 + 1
=
13.lwl = 20.25, larg wi < n12
~ - v 2 and It = 1 - ~V2
19. 1I = 1
21.larg wi < 7T/4
25. n/8 :::'::: 117T/2. 11 = O. I, ...
29. tV = iz
31. w = liz
35. ::!:V2
37. 2 : :': : v'6
41. w = e3z
43. z2 12k
Problem Set 18.1, page 753
3. 110 - SOx)" 110 + 25iz 2
1. 20x + 200, 20z + 200
7. F = 200 - (lOOlln 2) Ln z
5. F = (1lO/ln 2)Ln z
13. Use Fig. 388 in Sec. 17.4 with the z- and w-planes interchanged, and
cos z = sin (z + !7T).
15. <1> = 220 - 110xy
Problem Set 18.2, page 757
2 )') c]:>
2
2
2
1• 1I 2 - v 2 = e2 X(cos 2 ]
\' - sin,
xx = 4e X(cos )' - sin .v) = -c]:> YY' V2c]:> = 0
3. Straightforward calculation, involving the chain rule and the Cauchy-Riemann
equations
5. See Fig. 389 in Sec. 17.4. c]:> = sin 2 x cosh2 J - cos 2 x sinh 2 y.
9. (i) c]:> = U1(l - "\y). (ii) w = iz 2 maps R onto -2 ~ 1I ~ O. thus
<1>* = U1(l + !1I) = U1(l + -2x),».
11. By Theorem 1 in Sec. 17.2
13. <1> = 10[1 - (lin) Arg (z - 4)J, F = lOll + (i/n) Ln (z - 4)1
15. Corresponding rays in the w-plane make equal angles, and the mapping is
conformal.
i(
Problem Set 18.3, page 760
3. (lOO/d»'. Rotate through 7T12.
5. 100 - 2408/7T
7. Re F(z) = 100 + (200/n) Re (arcsin z) 9. (240/n) Arg z
11. To + (2/7T)(TI - To) Arg z
13.50 + (400/7T) Arg z
Problem Set 18.4, page 766
1. V = iV2 = iK, 'It = - Kx = canst, c]:>
3. F(z) = Kz (K positive real)
5. V = (I + 2i)K. F = (l - 2i)K::.
7. F(z) = Z3
= Ky = canST
9. Hyperbolas (x + I)y = COllst. Flow around a corner formed by x
x-axis.
11. Y/(X2 + .1'2) = C or X2 + (y - k)2 = k 2
13. F(z) = ::.Iro + rolz
= -} and the
A44
App. 2
Answers to Odd-Numbered Problems
z with the roles of the z- and w-planes
15. Use that w = arccos Z is w = cos
interchanged.
Problem Set 18.5, page 771
5. I - ,.2 cos 28
7. 2(r sin 8 - !r2 sin 28
+ ~,.3 sin 38 - + ... )
9. ~r2 sin 28 - ~r6 sin 68
11. ~7T2 - 4(,. cos 8 - ~r2 cos 28
+
,.3 sin 38 +
-1
13. -4 ( r sin 8 - -I
9
7T
+ ... )
~r3 cos 38 -
~
r5
+ ... )
sin 58 -
Problem Set 18.6, page 774
1.
7.
13.
15.
No; Izl2 is not analytic.
3. Use (2). F(~) = 2i
5. cJ:>(4, -4) = -12
Use (3). cJ:>(L 1) = -2 11. IF(ei 1)1 2 = 2 - 2 cos 'lA, e = 7T12, Max = 2
IF(z) I = [cos 2 2x + sinh2 2y]1I2, z = ±i, Max = [1 + sinh 2 2]112 = cosh 'l = 3.7622
No
Chapter 18 Review Questions and Problems, page 775
11.
13.
17.
23.
cJ:> = 10(1 - x + y), F = 10 - 10(1
(201ln 10) Ln ;::
Arg z = const
T(x. y) = x(2y + I) = const
27. F(;::)
= -
c
Ln (;:: - 5), Arg (;:: - 5)
+
i)z.
15. (101In 1O)(ln 100 - In r)
19. (-i/7T) Ln z
25. Circles (x - C)2 + )'2 = c 2
=
C
27T
29.20
+ -80 ( r sin
7T
e + -I ,.3 sin 3fJ + -I ,.5 sin 5fJ + ... )
3
5
Problem Set 19.1, page 786
1. 0.9817' 102, -0.1010' 103 ,0.5787' 10- 2, -0.1360' 105
3.0.36443/(17.862 - 17.798) = 0.36443/0.064 = 5.6942, 0.3644/(17.86 - 17.80) =
0.3644/0.06 = 6.073, 0.364/(17.9 - 17.8) = 3.64, impossible
5.
0.36443(17.862 + 17.798)
17.8622 _ 17.7982
=
0.36443' 35.660
12.996
319.05 _ 316.77 = ~ = 5.7000,
13.00
13.0
13
10
-22 = 5.702, ~2 = 5.70. = 5.7. . 8
_. 8
2.3
2
=
5
7. 19.95,0.049,0.05013: 20, 0, 0.05
9. In the present calculation, (b) is more accurate than (a).
11. -0.126' 10-2, -0.402' 10-3 ; -0.267' 10-6 , -0.847' 10- 7
13. Add first, then round.
15.
~
Q2
=
~I + EI
Q2
+
102
=
a
l
:
a2
EI
(1 _
~2 + ~2:
a2
a2
_
+ ...)
= ~I + ~l
a2
a2
_
~2 . ~1
a2
Q2
,
App. 2
A45
Answers to Odd-Numbered Problems
hence
l al)/I -all = lEII( -a - -:::::a2
a2
al
{/2
-
E21 ~ IErll + IEr21 ~ {3rl + {3r2
a2
19. (a) 19121 = 0.904761905, Echop = Eround = 0.1905.10- 5 ,
5
E,·.chop = Er.round = 0.2106.10- , etc.
Problem Set 19_2, page 796
1. g = 1.4 sin x, 1.37263 (= X5)
5. g = X4 + 0.2, 0.20165 (= x 3 )
7. 2.403 (= X5' exact to 3S)
9.0.904557 (= x 3 )
11. 1.834243 (= X4)
13. Xo = 4.5, X4 = 4.73004 (6S exact)
15. (a) 0.5, 0.375, 0.377968, 0.377964; (b) IIV7 = 0.377964 473
17. Xn+l = (2xn + 7/xn2 )/3, 1.912931 (= X3)
19. (a) Algorithm Bisect (f. ao_ bo, N) Bisection Method
This algorithm computes an interval [an, b1 J containing a solution of f(x) = 0
(f continuous) or it computes a solution Cn' given an initial interval lao, bol such
that f(ao)f(b o) < O. Here N is determined by (b - a)I2 N ~ {3, {3 the required
accuracy.
INPUT: Initial interval lao, boL maximum number of iterations N.
OUTPUT: Interval LaN, bNl containing a solution, or a solution Cn.
For 11 = 0, I, .. " N - I do:
Compute Cn = ~(an + b,J.
If f(c n ) = 0 then OUTPUT Cn. Stop. [Procedure completed]
Else continue.
If f(an)f(c n ) < 0 then an+l = an and bn +l = cn.
Else set {/n+l = Cn and b n + 1 = bn .
End
OUTPUT LUN' bN ]. Stop.
[Procedure completed]
End BISECT
Note that [aN' bNl gives (aN + bN )!2 as an approximation of the zero and (b N - aN)!2
as a cOlTesponding elTor bound.
(b) 0.739085; (c) 1.30980, 0.429494
21. 1.834243
23.0.904557
Problem Set 19.3, page 808
1. Lo(x) = -2.r + 19, L 1 (x) = 2x - 18, Pl(X) = 0.1082x + 1.2234,
Pl(9.4) = 2.2405
3.0.9971,0.9943.0.9915 (0.9916 4D), 0.9861 (0.9862 4D), 0.9835, 0.9809
5. P2(X) = -0.44304x 2 + 1.30906x - 0.02322, P2(0.75) = 0.70929
7. P2(X) = -0.1434x2 + 1.0895x, P2(0.5) = 0.5089, P2(1.5) = 1.3116
9. La = -t(x - I)(x - 2)(x - 3), Ll = !x(x - 2)(x - 3), ~ = -~x(x - 1)(x - 3),
~ = tx(x - I)(x - 2); P3(X) = 1 + 0.039740x - 0.335187x2 + 0.060645x 3;
P3(O·5) = 0.943654 (6S-exact 0.938470), P3(1.5) = 0.510116 (0.511828),
P3(2.5) = -0.047993 (-0.048384)
A46
App. 2
Answers to Odd-Numbered Problems
13. P2(X)
=
0.9461x - 0.2868x(x - 1)/2
= -0.1434x2 +
1.0895x
15.0.722,0.786
17. 8f1/2 = 0.057839, 8f3/2 = 0.069704. etc.
Problem Set 19.4, page 815
9. [-1.39(x - 5)2 + 0.58(x - 5)3]" = 0.004 at x = 5.8 (due to roundoff; should be 0).
11. 1 - ~X2 + ~X4
13.4 - 12x2 - 8x 3 ,4 - 12x2 + 8x 3 . Yes
15. I - x 2 , -2(x - l) - (x - 1)2 + 2(x - 1)3,
-1 + 2(x - 2) + 5(x - 2)2 - 6(x - 2)3
17. Curvature f"/(I + j'2)3/2 = f" if If'l is small
19. Use that the third derivative of a cubic polynomial is constant, so that gill is
piecewise constant, hence constant throughout under the present assumption. Now
integrate three times.
Problem Set 19.5, page 828
1. 0.747131
3. 0.69377 (5S-exact 0.69315)
7.0.894 (3S-exact 0.908)
5. 1.566 (4S-exact 1.557)
9.1h /2 + Eh/2 = 1.55963 - 0.00221 = 1.55742 (6S-exact 1.55741)
11. J h /2 + Eh/2 = 0.90491 + 0.00349 = 0.90840 (5S-exact 0.90842)
13. 0.94508, 0.94583 (5S-exact 0.94608)
15.0.94614588, 0.94608693 (8S-exact 0.94608307)
17. 0.946083 (6S-exact)
19. 0.9774586 (7S-exact 0.9774377)
21. x - 2 = t, 1.098609 (7S-exact 1.098612)
23. x = !U + 1),0.7468241268 (lOS-exact 0.7468241330)
25. (a) M2 = 2. M2* = ~, IKM21 = 2/(J2n 2). 11 = 183. (b) iv) = 24/x 5 , 2m = 14
27. 0.08, 0.32, 0.176, 0.256 (exact)
29.5(0.1040 - !. 0.1760 + i· 0.1344 - ~. 0.0384) = 0.256
r
Chapter 19 Review Questions and Problems, page 830
17.4.266,4.38, 6.0, impossible
19.49.980,0.020; 49.980, 0.020008
21. 17.5565 ~ s ~ 17.5675
23. The same as that of a.
25. -0.2, -0.20032, -0.200323
27.3,2.822785,2.801665,2.801386, VlOI386
29.2.95647.2.96087
31. 0.26, M2 = 6, M2* = 0, -0.02 ~ E ~ 0, 0.24 ~ a ~ 0.26
33. 1.001005, -0.001476 ~ E ~ 0
Problem Set 20.1, page 839
1. Xl = -2.4, X2 = 5.3
5. Xl = 2, X2 = 1
3. No solutlOn
7. Xl = 6.78, x2 = -11.3,
X3
= 15.82
App. 2
A47
Answers to Odd-Numbered Problems
9. Xl = 0, x 2 = T1 arbitrary, X3 = 5f1 + 10
11. Xl = fl , x 2 = t2, both arbitrary, X3 = 1.25fl
13. Xl = 1.5, X2 = -3.5, X3 = 4.5, X4 = -2.5
-
2.25t2
Problem Set 20.3, page 850
Exact 21.5, 0, -13.8
5. Exact 2, I, 4
7. Exact 0.5, 0.5, 0.5
(3
)T = [0.49982
0.50001 0.50002], (b) X (3)T = [0.50333 0.49985 0.49968]
6, 15, 46, 96 steps; spectral radius 0.09, 0.35, 0.72, 0.85, approximately
[1.99934 1.00043 3.99684]T (Jacobi, step 5): [2.00004 0.998059 4.00072jT
(Gauss-Seidel)
17. v'306 = 17.49, 12, 12 19. V 18k2 = 4.24Ikl, 41kl, 41kl
3.
9.
11.
13.
(a) X
A48
App. 2
Answers to Odd-Numbered Problems
Problem Set 20.4, page 858
1.
5.
7.
13.
17.
21.
[t -]
~]
-0.1
1]
12, v'62 = 7.87,6,
1.9, V1.35 = 1.16, 1. lO.3
6, \''6, 1. [1 1
K = 100· 100
46 ~ 6· 17 or 7 . 17
[-0.6 2.8]T
0.5
3.14, V56 = 7.07,4, [-1 I ~ -~]
1.0]
11. II AliI = 17, II A-I 111 = 17, K = 289
15. K = 1.2' 1~~ = 1.469
19. [0 11T, [1 -OAIT,289
23.27,748,28375,943656,29070279
Problem Set 20.5, page 862
1.
5.
11.
13.
3. 8.95 - 0.388x
-11.4 + 5Ax
s = -675 + 901, Vav = 90 kmlh
9.4 - 0.75x - 0.125x2
5.248 + 1.543x, 3.900 + 0.5321x + 2.021x2
-2.448 + 16.23x, -9.114 + 13.73x + 2.500x 2,
-2.270 + 1.466x -1.778x 2 + 2.852x 3
Problem Set 20.7, page 871
1. 5 ~ A ~ 9
3.5,0, 7; radii 4, 6, 6
5.IA - 4il ~ v'2 + 0.1. IAI ~ 0.1, IA - 9il ~ v'2
7.111 = 100,122 = 133 = I
9. They lie in the intervals with endpoints ajj :::t::: (11 - 1)10- 6 . (Why?)
11. 0 lies in no Gerschgorin disk, by (3) with >; hence det A = Al ... An
13. peA) ~ Row sum norm II A
IIx = max L
J
15.
Vi53
=
12.37
17.
lajkl = max (Iaiil
k
Vl22 = 11.05
=1=
O.
+ GerschgOlin radius)
J
19. 6 ~ A ~ 10. 8 ~ A ~ 8
Problem Set 20.S, page 875
1. q = 4,4.493.4.4999: lEI ~ 1.5.0.1849,0.0206
3. q = 8,8.1846,8.2252; lEI ~ L 0.4769, 0.2200
5. q = 4,4.786,4.917; lEI ~ 1.63,0.619,0.399
7. q = 5.5,5.5738.5.6018: lEI ~ 0.5. 0.3115, 0.1899: eigenvallle~ (4S) 1.697,3.382,
5.303.5.618
9. )' = Ax = Ax, yTx = AxTx, yTy = A2xTx,
E2 ~ '~/Y/XTX - (yTX/XTl\.)2 = A2 - A2 = 0
11. q = 1. ... , - 2.8993 approximates - 3 (0 of the given matrix),
lEI ~ 1.633, .. ',0.7024 (Step 8)
Problem Set 20.9, page 882
[
1.
3~
- ~.!lO2776
-UlO1776
6.730769
l.H-l6154
L~61~1 ~[
1.769230
0.9g0000
-0.-1--1-1814
-O.-1-·HS1-1-
0.H70l64
U37iOO:]
0
0.371803
0.4H9H36
App.2
A49
Answers to Odd-Numbered Problems
5. Eigenvalues 8, 3, 1
-2.50867
r 5.M516
01 r 7.45139
0.374953 , -1.56325
5.307219
-2.5086~
0.374953
r 7.91494
1.04762
0.0983071
1.00446
03124691
:1.000482
3.08458
0.0312469
r18.3171
IB3786
o0.360275 1 , r 0.396511
0.881767
~.881767
8.29042
0.360275
r18.3910
1.39250
0.396511
8.24727
0
0.0600924
:O6~241·
1.37414
0.177669
~.177669
9.
1
0098307
3.544142
-0.646602
-~.646602
7.
0
-1.56325
8.23540
:01lm141
0.0100214
1.37363
7an24
0.0571287
~.0571287
4.00088
r
7OO322
1 r7~n98
0.0249333 , 0.0326363
0.0326363
o
0.0249333
0.996875
4.U0034
0
0.00621221
:=21221
0.996681
0.0186419
~.0186419
r
4.00011
:.001547821
U.00154782
0.Y96669
Chapter 20 Review Questions and Problems, page 883
17.
r4
-I
2{
19.
L6
-3
I{
21. All nonzero entries of the factors are 1.
23.
2.8193
-1.5904
-1.5904
1.2048
-0.0482
-0.0241
[
27. 15,
,189,
8
=~:::~l
,/21, 4
31. 14,
,178, 7
37. 11.5 . 4.4578 = 51.2651
35.9
39. Y = 1.98
1
0.1205
29.7,
33. 6
25. Exact [-2
(4D-values)
+ 0.98x
41. Centers 1. 1, 1. radii 2.5. 1. 2.5 (A = 2.944.0.028 ::':: 0.290i, 3D)
43. Centers 5, 6, 8: radii 2,1, I, (A = 4.1864. 6.4707, 8.3429. 5S)
-2.23607
[
45.
15
-~.23607
o
5.8
-3.1
-3.1
6.7
J
,Step3:
[
9.M913
-1.06216
0
-1.06216
4.28682
-0.00308
-:00308J
0.26345
A50
App. 2
Answers to Odd-Numbered Problems
Problem Set 21.1, page 897
1. Y = eX, 0.0382, 0.1245 (elTor of X5, .\"10)
3. Y = x - tanh x (set y - x = til, 0.009292, 0.0188465 (elTor of .\"5, XIO)
5. y = eX, 0.001275, 0.004200 (elTor of X5' XIO)
7. Y = 11(1 - x 2 /2), 0.00029, 0.01187 (elTor of X5' xw)
9. y = 11(1 - x 212), 0.03547. 0.28715 (elTor of X5' XIO)
11. Y = 11(1 - x 2 /2); error -10- 8 , -4' 10-8 , . . " -6' 10-7 , + 10- 5;
about 1.3' 10-5 by (10)
13.y = xe x ; eITor'l~ (for x = L"" 3) 19,46,85,139,213,315,454,640,889,1219
15. Y = 3 cos x - 2 cos 2 x; error' 107 : 0.18, 0.74, 1.73,3.28,5.59,9.04, 14.33,22.77,
36.80, 61.42
17. Y = 1I(x5 + 1), 0.000307, -0.000259 (error of X5' XIO)
19. The elTors are for E.-c. 0.02000, 0.06287. 0.05076. for Improved E.-C. -0.000455,
0.012086,0.009601, for RK 0.0000012,0.000016,0.000536.
Problem Set 21.2, page 901
3. y = e- O. 1x2 ; elTors 10-6 to 6· 10-6
5. y = tan x; .\'4' .. " )'10 (error' 105): 0.422798 (-0.48),0.546315 (-1.2),0.684161
(-2.4),0.842332 (-4.4),1.029714 (-7.5),1.260288 (-13),1.557626 (-22)
7. RK-elTor smaller. elTor' 105 = 0.4, 0.3, 0.2. 5.6 (for x = 0.-1-, 0.6, 0.8, 1.0)
9• .\'4 = 4.229690, Y5 = 4.556 859, )'6 = 5.360657. Y7 = 8.082 563
11. ElTors between -6 . 10-7 and + 3 . 10- 7 . Solution eX - x - I
13. Errors' 105 from x = 0.3 to 0.7: -5, -II, -19, -31, -47
15. (a) 0, 0.02, 0.0884, 0.215 848,)'4 = 0.417818'.\'5 = 0.708887 (poor).
(b) By 30-50%
Problem Set 21.3, page 908
3. )'1 = eX, )'2 = -ex. elTors range from ±0.02 to ±0.23, monotone.
5. )'~ = Y2, )'~ = -4)'1' )' = Yl = I, 0.84, 0.52, 0.0656, -0.4720; y = cos 2x
x
x
7')'1 = 4e- sin x')'2 = 4e- cos x; elTors from 0 to about ±O.I
9. ElTors smaller by about a factor 104
11. Y = 0.198669,0.389494,0.565220,0.719632,0.847790;
y' = 0.980132,0.922062,0.830020,0.709991,0.568572
13. Y1 = e- 3 .1: - e- 5 :r')'2 = e- 3 .1: + e- 5 .1:;)'1 = 0.1341. 0.1807, 0.1832, 0.1657,
0.1409;)'2 = 1.348,0.9170.0.6300,0.4368.0.3054
17. You gel the exact solution, except for a roundoff elTor [e.g., Yl = 2.761 608,
y(0.2) = 2.7616 (exact), etc.]. Why?
19. Y = 0.198669,0.389494,0.565220,0.719631. 0.847789;
y' = 0.980132,0.922061. 0.830019, 0.709988, 0.568568
Problem Set 21.4, page 916
3. 105, 155, 105, 115; Step 5: 104.94, 154.97, 104.97, 114.98
5.0.108253, 0.108253,0.324760,0.324760; Step 10: 0.108538, 0.108396, 0.324902,
0.324831
App. 2
A51
Answers to Odd-Numbered Problems
7. 0, O. O. O. All equipotentia11ines meet at the comers (why?). Step 5: 0.29298.
0.14649,0.14649.0.073245
9. - 3un + U12 = -200, Un - 3U12 = -100
11. U12 = U32 = 31.25, U21 = U23 = 18.75, ujk = 25 at the others
13. U21 = U23 = 0.25, U12 = U32 = -0.25, Ujk = 0 else
15. (a) Un = -U12 = -66. (b) Reduce to 4 equations by symmetry.
Un = U31 = -U15 = - 1I35 = -92.92, U21 = -U25 = -87.45,
U12 = U 32 = -U 14 = -U 34 = -64.22, U 2 2 = -U24 = -53.98,
U13
17.
=
U23
=
U33 =
0
\13,
Un = U21 = 0.0849, U12 = U22 = 0.3170. (0.1083, 0.3248 are 4S-values of
the solution of the linear system of the problem.)
Problem Set 21.5, page 921
5. Un = 0.766. U21 = 1.109. U12 = 1.957. U22 = 3.293
7. A as in Example I, right sides -2, -2, -2, -2. Solution Un = U21 = 1.14286,
U 1 2 = U22 = 1.42857
11. -4un + U21 + U12 = - 3. Un - 4U21 + U22 = -12, Un - 4U12 + U22 = 0,
2U21 + 2U12 - 12u22 = - 14. Un = U22 = 2, U21 = 4. U12 = 1. Here
-14/3 = -~(1 + 2.5) with 4/3 from the stencil.
13. b = [-380 -190, -190, O]T; Un = 140, U21 = U 1 2 = 90, U22 = 30
Problem Set 21.6, page 927
5.0.1636.0.2545
(t
= 0.04. x = 0.2,0.4).0.1074.0.1752 (t = 0.08),0.0735.0.1187
(t = 0.12),0.0498,0.0807 (t
= 0.16),0.0339,0.0548
(t
= 0.2; exact 0.0331,0.0535)
7. Substantially less accurate, 0.15, 0.25 (1 = 0.04),0.100,0.163 (t = 0.08)
9. Step 5 gives 0,0.06279,0.09336,0.08364,0.04707, O.
11. Step 2: 0 (exact 0),0.0453 (0.0422),0.0672 (0.0658), 0.0671 (0.0628),0.0394
(0.0373), 0 (0)
13.0.1018,0.1673,0.1673,0.1018 (t = 0.04),0.0219,0.0355, ... (t = 0.20)
15.0.3301,0.5706.0.4522.0.2380 (t = 0.04).0.06538.0.10604,0.10565.0.6543
(t = 0.20)
Problem Set 21.7, page 930
1. For x = 0.2, 0.4 we obtain 0.012, 0.02 (t = 0.2), 0.004, 0.008 (t = 0.4), -0.004,
-0.008 (t = 0.6). etc.
3. u(x, 1) = 0, -0.05, -0.10, -0.15, -0.075,0
5.0.190,0.308,0.308,0.190 (0.178, 0.288, 0.288, 0.178 exact to 3D)
7.0,0.354.0.766, 1.271, 1.679. 1.834.... (t = 0.1); 0.0.575.0.935, 1.135, 1.296.
1.357, ... (t = 0.2)
Chapter 21 Review Questions and Problems, page 930
17.
y = tan x; 0 (0),0.10050 (-0.00017). 0.20304 (-0.00033), 0.30981 (-0.00047),
0.42341 (-0.00062), 0.54702 (-0.00072)
ASl
App. 2
Answers to Odd-Numbered Problems
19. 0.1 003349 (0.8 . 10-7 ) 0.2027099 (I.6 . 10- 7 ), 0.3093360 (2.1 . 10-7 ). 0.4227930
(2.3· 10-7 ),0.5463023 (1.8' 10-7 )
25.y(0.4) = 1.822798,.\'(0.5) = 2.046315,)'(0.6) = 2.284161,)'(0.7) = 2.542332,
y(0.8) = 2.829714, y(0.9) = 3.160288, .v(1.0) = 3.557626
27'.\"1 = 3e- 9x,.I"2 = -5e- 9:r, [1.23251 -2.05419J, [0.506362 -0.843937],···,
[0.035113 -0.058522]
29. 1.96, 7.86, 29.46
31. II(P l l ) = 1I(P31) = 270. U(P21 ) = U(P13 ) = U(P23 ) = U(P33 ) = 30,
U(P 12 ) = U(P32 ) = 90, 1I(P22 ) = 60
35.0.06279,0.09336,0.08364,0.04707
37.0, -0.352, -0.153,0.153,0.352,0 if t = 0.12 and 0,0.344,0.166, -0.166,
-0.344, 0 if t = 0.24
39.0.010956.0.017720.0.017747,0.010964 if t = 0.2
Problem Set 22.1, page 939
3. f = 3(-"1 - 2)2 + 2(X2 + 4)2 - 44. Step 3: [2.0055 - 3.9Y75]T
5. f = 0.5(x1 - 1)2 + 0.7(X2 + 3)2 - 5.8, Step 3: rO.99406 -3.0015]T
7. f = 0.2(X1 - 0.2)2 + X2 2 - 0.008. Step 3: [0.493 -O.Oll]T,
Step 6: [0.203 0.004]T
Problem Set 22.2, page 943
1. X3, X4 unused time on MI' M 2. respectively
3. No
11. fmax = f(O. 5) = 10
13. f max = f(9, 6) = 36
15. fmin = f(3.5, 2.5) = - 30
17. X1/3 + x2/2 ~ 100, x1/3 + -"2/6 ~ 80, f = 150X1 +
fmax = f(210, 60) = 37500
19. 0.5X1 + 0.75x2 ~ 45 (copper), 0.5X1 + 0.25x2 ~ 30, f = 120x1 + 100x2,
fmax = f(45, 30) = 8400
Problem Set 22.3, page 946
1. f(12011 I, flOIl 1) = 48011 I
2100 200)
3. f ( - 3 - ' 2/3
= 78000
5. Matrices with Rows 2 and 3 and Columns 4 and 5 interchanged
7. f(O, [0) = -10
9. f(5, 4, 6) = 478
Problem Set 22.4, page 952
1. f(-l-. -1-) = 72
7. f(l, 1, 0) = 12
3. f(10, 30) = 50
f(!, 0, ~) = 3
5. f(lO, 5) = 5500
9.
Chapter 22 Review Questions and Problems, page 952
n. Step 5: [0.353 -0.028]T. Slower
13. Of course! Step 5: [-1.003 1.897]T
21. f(2, -1-) = 100
23. f(3, 6) = -54
25. /(50, 100) = 150
App. 2
Answers to Odd-Numbered Problems
A53
Problem Set 23.1, page 958
o
o
9.
0
o
o
1
15.
0
o
o
o
o
0
CD-------®
11.
1
f:
0
15'm
3
]
0
0
0
0
0
0
0
0
0
0
13.
0
4
Edge
o
21.
><
2
>
3
~<ll
0
25.
0
E
23.
0
4
1
><
<ll
>
0
2
3
0
Vertex
1
2
3
4
Incident Edges
-eh -e2, e3, -e4
el
e2, -e3
e4
Problem Set 23.2, page 962
1.4
5.4
3.5
9. The idea is to go backward. There is a VIc-I adjacent to Vk and labeled k - 1, etc.
Now the only vertex labeled 0 is s. Hence A(vo) = 0 implies Vo = s, so that
Vo - VI - ... - Vk-l - Vic is a path s ~ Vic that has length k.
15. No; there is no way of traveling along (3, 4) only once.
21. From Tn to 100m, 10m, 2.5m, 111 + 4.6
Problem Set 23.3, page 966
1.
3.
5.
7.
(1. 2).
(1, 2),
(1,4),
(1, 5),
(2,4).
(I, 4),
(2, 4),
(2. 3),
(4,
(2,
(3,
(2.
3);
3);
4),
6),
L2 = 6. L3 = 18. L4 = 14
L2 = 2, L3 = 5, L4 = 5
(3, 5); L2 = 4, L3 = 3, L4 = 2, L5 = 8
(3. 4), (3, 5); ~ = 9, ~ = 7, L4 = 8, L5 = 4, L6 = 14
Problem Set 23.4, page 969
2
1. 1 :,
/
3
L
=
12
I
/
3. 4 ,\""2
3- 5
4"\
5
L
=
10
8
/
5. 1 - 2"\ 5
3~
6- 4
"\
7
L
= 28
A54
App. 2
Answers to Odd-Numbered Problems
2
/
L = 38
11. Yes
5-6,
15. G is connected. If G were not a tree, it would have a cycle, but this cycle would
provide two paths between any pair of its vertices, contradicting the uniqueness.
19. If we add an edge (u, u) to T, then since T is connected, there is a path U ~ u in T
which. together with (II, u), forms a cycle.
9. I - 3 - 4 '\.
Problem Set 23.5, page 972
1. (I, 2), (1.4), (3, 4), (4,5). L = 12
3. (I. 2). (2, 8), (8, 7), (8. 6), (6, 5), (2, 4), (4, 3), L
5. (1,4), (3,4). (2,4). (3,5), L = 20
= 40
7. (I, 2), (I, 3), (I, 4), (2, 6), (3, 5), L = 32
11. If G is a tree
13. A shortest spanning tree of the largest connected graph that contains vertex 1
Problem Set 23.6, page 978
1.
3.
5.
7.
I - 2 - 5, Ilf = 2; 1 - 4 - 2 - 5, Ilf = 2, etc.
I - 2 - 4 - 6 . .1f = 2; I - 2 - 3 - 5 - 6. Ilf = I, etc.
f12 = 4, f13 = 1. f14 = 4. f42 = 4. f43 = 0,125 = 8, f35 = 1, f = 9
f12
= 4. f13 = 3, f24 = 4, f35 = 3, f54 = 2, f4fj = 6, f56 = 1, f = 7
~ {~5,6},28
11. {2,~ 6},50
13. I - 2 - 3 - 7, lJ.f = 2; I - 4 - 5 - 6 - 7, Ilf = 1;
1 - 2 - 3 - 6 - 7, Ilf = 1; fmax = 14
15. {3, 5, 7}. 22
17. S = {I, 41. cap (S.
19. If fii < Cij as well as fii > 0
n=
6 + 8
= 14
Problem Set 23.7, page 982
3.
5.
7.
9.
(2, 3) and (5. 6)
1 - 2 - 5, .:It = 2; 1 - 4 - 2 - 5, ~t = I; f = 6 + 2 + 1 = 9
1 - 2 - 4 - 6, .:It = 2: 1 - 3 - 5 - 6. Il t = 1; f = 4 + 2 + I = 7
By considering only edges with one labeled end and one unlabeled end
17. S = {I, 2,4, 51. T = {3, 6}, cap (S, n = 14
Problem Set 23.8, page 986
1. No
3. No
5. Yes, S = { I, 4, 5, 8}
7. Yes; a graph is not bipartite if it has a nonbipartite subgraph.
9.1 - 2 - 3 - 5
11. (1, 5), (2, 3) by inspection. The augmenting path I - 2 - 3 - 5
gives I - 2 - 3 - 5, that is, (\, 2), (3, 5).
13. (1,4), (2. 3). (5, 7) by inspection. Or (1, 2), (3, 4), (5, 7) by the use of the path
1 - 2 - 3 - 4.
15. 3
25. K3
19. 3
23. No; K5 is not planar.
App. 2
ASS
Answers to Odd-Numbered Problems
Chapter 23 Review Questions and Problems, page 987
0
0
13.
r~
1
]
0
0
0
0
15.
0
0
0
0
o
o
o
CD---0
o
o
o
o
17.
21.
o
1
19.\
o
o
CD
o
23.4
Incident Edges
Ve11ex
/
e2, -e3
2
3
-eb e3
'eb -e2
25.4
29. I - 4 - 3 - 2, L
=
27. L2 = 10. L3
33. f = 7
16
=
15, L4
=
13
Problem Set 24.1, page 996
1. qL
19, qM = 20, qu = 20.5
5. qL = 69.7, qM = 70.5, qu = 71.2
9. qL = 399, qM = 401, qu = 401
13..r = 70.49, s = 1.047,IQR = 1.5
17. 0 0 300
=
3.qL = 38,QM = 44,Qu = 54
7. qL = 2.3, qM = 2.4, qu = 2.45
11. x = 19.875, s = 0.835, IQR = 1.5
15. x = 400.4, s = 1.618, IQR = 2
19. 3.54, 1.29
Problem Set 24.2, page 999
1. 4 outcomes: HH, HT, TH, TT (H = Head, T = Tail)
3.62 = 36 outcomes (1, 1), (1, 2), .. " (6, 6)
5. Infinitely many outcomes S, SCS, ScScS, ... (S = "Six")
7. The space of ordered triples of nonnegative numbers
9. The space of ordered pairs of numbers
11. Yes
13. E = IS, scs, SCSCS}, E" = {SCSCSCS, ScScScScS, ... } (S = "Six")
Problem Set 24.3, page 1005
1. (a) 0.9 3 = 72.9%, (b) 190~ • ~~. ~~ = 72.65%
3 490. 489 • 488 • 487 • 486 - 90 3501.
• 500
499
498
497
496 -
5. 1 - 2~ = 0.96
9. P(MMM)
+
P(MMFM)
.
/0
7. I - 0.75 2 = 0.4375 < 0.5
+ P(MFMM) + P(FMMM) =
~
+ 3 . 1~
=
1~
A56
App. 2
Answers to Odd-Numbered Problems
11. 3~ + ~~ - 3~ = ~~ by Theorem 3, or by counting outcomes
13. 0.08 + 0.04 - 0.08 . 0.04 = 11.68%
17. 1 - 0.97 4 = 11.5%
15.0.95 4 = 81.5%
Problem Set 24.4, page 1010
5. (2~)
3. In 40320 ways
7.210,70. 112.28
11. (~~) = 635013559600
15.676000
= 1140
= 1260. AIlS. 111260
9. 9!/(2!3!4!)
13. 1184, 5121
Problem Set 24.5, page 1015
1. k = 1/55 by (6)
3. k = II8 by (10)
7. 1 - P(X ~ 3) = 0.5
5. No because of (6)
9. P(X
> 1200)
f
=
2
6[0.25 -
(x -
1.5)2] dx = 0.896.
AilS.
0.8963 = 72%
1.2
11. k
17. X
= 2.5; 50%
> b, X ~ b.
13. k = 1.1565; 26.9%
X < c. X
~
c. etc.
Problem Set 24.6, page 1019
1. 2/3, 1118
3.3.5.2.917
7. $643.50
5.4, 16/3
9. JL = lie = 25; P
13. 750, 1, 0.002
=
20.2%
11.~, 2~'
(X -
~)V20
15. 15c - 500c 3 = 0.97. c = O.ms55
Problem Set 24.7, page 1025
1. 0.0625, 0.25, 0.9375, 0.9375
3.64%
5.0.265
7. f(x) = OS"e- o.5 /x!, f(O) + f(l) = e- O.5 ( 1.0 + 0.5) = 0.91. Am. 9%
9. I - e- O.2 = 18%
11. 0.99100 = 36.6%
13. ~~~, ~~~, 2~~' :di6
Problem Set 24.8, page 1031
1. 0.1587, 0.6306, 0.5, 0.4950
5.16%
9. About 23
13. t = 1084 hours
3. 17.29, 10.71, 19.152
7. 31.1 %, 95.5%
11. About 58st
Problem Set 24.9, page 1040
1. 1/8, 3/16, 3/8
3. 2/9, 2/9, 1/2
5. f2(Y) = 11(/32 - ll'2) if ll'2 < Y < /32 and 0 elsewhere
7. 27.45 mm, 0.38 mm
9. 25.26 cm, 0.0078 cm
App. 2
Answers to Odd-Numbered Problems
13. lndependent, .f1(X)
15. 50lJf
A57
= O.le- O.lx if X> 0,
f2(Y)
= O.le- Oly if Y > 0, 36.8%
17. No
Chapter 24 Review Questions and Problems, page 1041
23. x = 22.89, s = 1.028.
21. QL = 22.3, QM = 23.3, Qu = 23.5
25. H, TH, TTH, etc.
27. f(O) = 0.80816. f(l) = 0.18367. f(2) = 0.00816
29. Always B !: A U B. If also A !: B, then B = A U B, etc.
31. 7/3, 8/9
33. 118.019, 1.98, 1.65%
35.0, 2
37. JL = 100/30
39. 16%, 2.3% (see Fig. 520 in Sec. 24.8)
S2
= 1.056
Problem Set 25.2, page 1048
3. 1 = pk(1 - p)n-k,
5. 11120
7. 1 = f(x), aOn l)/ap
9. it = x
13. = 1
p = kin, k = number of Sllccesses in n trials
= lip -
e
p
= 0, = 1/x
11. = nl'i. Xj = l!x
15. Variability larger than perhaps expected
(x - 1)10 - p)
e
Problem Set 25.3, page 1057
1. CONFo.95 [37.47 ~ JL ~ 43.87}
3. Shorter by a factor v'2
5.4, 16
7. Cf. Example 2. n = 166
11. CONFo.99 [63.71 ::::; JL ~ 66.29}
9. CONFo.99 [20.07 ~ JL ~ 20.33}
13. c = 1.96, x = 87, S2 = 87' 413/500 = 71.86, k = cslVn = 0.743,
CONFo.95 {86 ~ JL ~ 88}, CONF095 (0.17 ~ p ~ 0.18}
15. CONFo.95 (0.00045 ~ a 2 ~ 0.00l31} 17. CONFO.95 [0.73 ::::; a 2 ~ 5.19}
19. CONFO.95 [23 ~ a 2 ~ 548}. Hence a larger sample would be desirable.
21. Normal distributions, means -27,81, 133, variances 16, 144,400
23. Z = X + Y is normal with mean 105 and variance 1.25.
Ans. P(l04 ~ Z ~ 106) = 63%
Problem Set 25.4, page 1067
t = V7(0.286 - 0)/4.31 = 0.18 < c = 1.94; do not reject the hypothesis.
c = 6090 > 6019; do nol reject the hypothesis.
ifln = I, c = 28.36: do not reject the hypothesis.
JL < 28.76 or JL > 31.24
Alternative JL =1= 1000, t = v'2O (996 - 1000)/5 = -3.58 < c = - 2.09 (Table
A9, 19 degrees of freedom). Reject the hypothesis JL = 1000 g.
11. Test JL = 0 against JL =1= O. t = 2.11 < c = 2.36 (7 degrees of freedom). Do not
reject the hypothesis.
13. ll' = 5%, c = 16.92 > 9.0.5 2 /0.4 2 = 14.06; do not reject hypothesis.
15. to = \1'10' 9·17119 (21.8 - 20.2)1\1'9.0.62 + 8.0.5 2 = 6.3 > c = 1.74
(17 degrees of freedom). Reject the hypothesis and assert that B is better.
1.
3.
5.
7.
9.
ASS
App. 2
17.
Answers to Odd-Numbered Problems
= 50/30 = 1.67 < c = 2.59 [(9, 15) degrees of freedom]: do not reject the
hypothesis.
Vo
Problem Set 25.5, page 1071
1. LCL = I - 2.58 . 0.03/v6 = 0.968, VCL = 1.032
3.11 = 10
5. Choose 4 times the original sample size (why?).
7. 2.58VO.024/\12 = 0.283. VCL = 27.783, LCL = 27.217
11. In 30% (5%) of the cases, approximately
13. VCL = Ill' + 3Vllp(\ - p), CL = Ill', LCL = "l' - 3Vl1p(1 - p)
15. CL = JL = 2.5, VCL = JL + 3~ = 7.2, LCL = JL - 3~ is negative in (b) and
we set LCL = O.
Problem Set 25.6, page 1076
1. 0.9825, 0.9384, 0.4060
5. peA; e) = e- 30H(1 + 308)
7. P(A; 8) = e- 50 (J
11. (l - 8)5, (l - 8)5 + 5e(l - 8)4
15. <1>«9 - 12 + ~)/V12(1 - 0.12)) =
17. (l - ~)3 + 3 . ~(l - ~)2 = ~
3.0.8187,0.6703.0.1353
9. 19.5%, 14.7%
13. Because /I is finite
0.22 (if c = 9)
Problem Set 25.7, page 1079
1. X02 = (30 - 50)2/50 + (70 - 50)2150 = 16 > c = 3.84: no
3.41
5. X02 = 2.33 < c = 11.07. Yes
7. ej = I1Pj = 370/5 = 74. X02 = 984174 = 13.3. c = 9.49. Reject the hypothesis.
9. X02 = I < 3.84; yes
13. Combining the results for x = lo, II, 12, we have K - r - 1 = 9 (r = I since we
estimated the mean. 1~ci:f = 3.87). Xo2= 12.98 < c = 16.92. Do not reject.
15. X02 = 49/20 + 49/60 = 3.27 < c = 3.84 (1 degree of freedom, a = 5%), which
supports the claim.
17. 42 even digits, accept.
Problem Set 25.8, page 1082
3. (~l8(l + 18 + 153 + 816) = 0.0038
5. Hypothesis: A and B are equally good. Then the probability of at least 7 trials
favorable to A is ~8 + 8 . ~8 = 3.5%. Reject the hypothesis.
7. Hypothesis JL = O. Alternative JL > 0, .r = 1.58,
t = \liD. 1.58/1.23 = 4.06 > c = 1.83 (a = 5%). Hypothesis rejected.
9. x = 9.67, s = 11.87, to = 9.67/(11.871'\115) = 3.15 > c = 1.76 (a = 5%).
Hypothesis rejected.
11. Consider .'J = Xj - Po.
13. peT ~ 2) = 0.1% from Table A12. Reject.
App. 2
A59
Answers to Odd-Numbered Problems
15. P(T ~ 15) = 10.8%. Do not reject.
17. P{T ~ 2) = 2.8%. Reject.
Problem Set 25.9, page 1091
1. Y = 1.9 + x
3. y = 6.7407 + 3.068x
5. y = 4 + 4.8x. 172 ft
7. y = -1146 + 4.32x
9. y = 0.5932 + 0.1l38x, R = 1/0.1138
11. qo = 76, K = 2.36V76/(7· 944) = 0.253, CONFo.95 { -1.58 ~ Kl ~ -1.06}
13. 3sx 2 = 500, 3sxy = 33.5, kl = 0.067, 3s y2 = 2.268. qo = 0.023. K = 0.02]
CONFo.95 {0.046 ~ Kl ~ 0.088}
Chapter 25 Review Questions and Problems, page 1092
i/
21. fl = 5.33.
= 1.722
25. CONFo.99 { 19.1 ~ J.L ~ 33.7}
29. CONFo.95 { 1.373 ~ J.L ~ 1.4511
33. c
= 14.74 > 14.5; reject J.Lo.
23. It will double.
27. CONFo.95 {0.726 ~ J.L ~ 0.75]}
31. CONFo.99 {0.05 ~ u 2 ~ 10}
14.74 - 14.40)
35. cD (
•~
= 0.9842
v 0.025
37.30.14/3.8 = 7.93 < 8.25. Reject.
39. Vo = 2.5 < 6.0 [(9.4) degrees of freedom]; accept the hypothesis.
41. Decrease by a factor v'2. By a factor 2.58/1.96 = 1.32.
43.0.9953,0.9825,0.9384, etc.
45. y = 1.70 + 0.55x
;;~".·APPENDIX
II
p
',.. j
---- ....
3
n"
Auxiliary Material
A3.1
Formulas for Special Functions
For tables of IlUllleric values. see Appelldix 5.
Exponential function
e
eX
(Fig. 544)
= 2.71828 1828459045235360287471353
(1)
Natural logarithm (Fig. 545)
In (xy) = In x
(2)
+ In y.
In (xly) = Inx - 1ny.
In x is the inverse of eX, and e ln x = x, e- ln x
= e1n nIx)
= IIx.
Logarithm of base ten 10glOx or simply log x
(3)
log x = M In x,
(4) In x
=
I
X
M loa
b"
M
I
M
= log e
=
0.434294481903251 82765 11289 18917
= In 10 = 2.30258509299404568401 7991454684
log x is the inverse of lOT, and I Olog x
= x,
I O-log X
=
IIx.
Sine and cosine functions (Figs. 546.547). In calculus, angles are measured in radians,
so that sin x and cos x have period 27['.
sin .1' is odd. sin (-x) = - sin x, and cos x is even. cos (-x) = cos x.
y
I
y
5
2
o
x
Fig. 544.
A60
Exponential function eX
I
x
-2,
Fig. 545.
Natural logarithm In x
SEC. A3.1
Formulas for Special Functions
A61
y
y
/
x
/
~l
\. X
sin x
Fig. 546.
Fig. 547.
1°
=
0.01745 32925 19943 radian
radian
=
57° 17' 44.80625"
cos x
= 57.29577 95131 °
sin 2 x
(5)
sin (x
+ y) =
sin (x - y)
(6)
cos (x
{
sin (7T
(9)
cos 2
(10)
X
=
{
cos x cos y - sin x sin y
= cos x cos y + sin x sin y
cos 2x
x) = sinx.
-
~(1
cos
= cos2 X
-
sin2 x
=
~[-cos (x
cos x cos y
=
~[cos (x
sin x cos y
=
Msin (x
cos u
+
sin v
x) = -cosx
(7T -
sin 2 x
+ cos 2x),
=
+ y) +
+ y) +
+ y) +
~(l - cos 2x)
cos (x - y)]
em (x - y)]
sin (x - y)]
u+v
u-v
2
2
u+v
u-v
u+v
u-v
2
2
= 2 sin - - - cos - - -
+ cos v = 2 cos --2- cos --2-
cos v - cos u
(14)
- cos x sin y
x= cos (x - ;) = cos ( ; - x)
cos x= sin (x + ;) = sin ( ; - x)
sin u
(12)
sin y
sin
sin x sin y
(1 L)
+ cus x
= sin x cos y
sin 2x = 2 sin x cos x,
(8)
(13)
cos 2 X = 1
sin x cos y
+ y) =
cos (x - y)
(7)
+
A cos x
+B
sin x =
VA
2
A cos x
+B
sin x =
VA
2
= 2 sin - - - sin - - -
+ B2 cos (x ±
+
B2 sin (x
0),
± 8),
tan 8 =
tan 8
=
sin 8
B
+-
cos 8
sin 8
eas 8
A
=
A
-T-
B
A62
APP. 3
Auxiliary Material
y
y
5
5
)
)
-Tr
/
Tr
-Tr
x
( (
\
-5
\
-5
tan x
Fig. 548.
cot x
Fig. 549.
Tangent, cotangent, secant, cosecant (Figs. 548, 549)
(15) tanx
(16)
=
sinx
cosx
tan (x
+ y)
=
cot x
=
tan x
+
cosx
sec x
sinx
tany
=
tan (x - y) =
1 - tanxtany
cscx
cos x
=
sin x
tan x - tan y
+
1
tan x tan v
Hyperbolic functions (hyperbolic sine sinh x, etc.; Figs. 550, 551)
(17)
tanh x
(18)
cosh x
(19)
+
=
sinh x
coshx '
sinh x
sinh2 x
(21)
=
i(cosh 2x - 1),
=
sinh x
cosh x - sinh x
= eX,
COSh2 X
(20)
cosh x
coth x
-
sinh2 x
= I
COSh2 X
=
i(cosh 2x
y
y
4
4
2
-2
g. 550.
/
2 x
;-2
-4
-4
Fig. 551.
+
l)
\
2 x
-2
-2
sinh x (dashed) and cosh x
e- x
=
tanh x (dashed) and coth x
SEC A3.1
A63
Formulas for Special Functions
{
(22)
sinh (x ± y)
sinh x cosh y ± cosh x sinh y
=
cosh (x ± y) = cosh x cosh y ± sinh x sinh y
tanh (x ± y)
(23)
tanh x ± tanh y
= ------I ± tanh x tanh y
Gamma function (Fig. 552 and Table A2 in App. 5). The gamma function f(a) is defined
by the integral
(24)
(a> 0)
which is meaningful only if a> 0 (or, if we consider complex a, for those a whose real
part is positive). Integration by parts gives the importantfullctional relatio1l of the gamma
function,
(25)
f(a
+
1)
= af(a).
From (24) we readily have r(l) = I: hence if a is a positive integer. say k. then by
repeated application of (25) we obtain
+
r(k
(26)
1)
=
(k = 0, 1, .. ').
k!
This shows that the ga11l111afunction can be regarded as a generalization of the elementary
factorial jilllction. [Sometimes the notation (a - L)! is used for rea), even for noninteger
values of a, and the gamma function is also known as the factorial function.]
By repeated application of (25) we obtain
f(a)=
rea +
1)
f(a
a(a
+
+
2)
f(a + k + 1)
+ I )(a + 2) ... (a +
----'--
a(a
1)
nw
4
I
I
I
I
I
I
I
I
-2
in
Fig. 552.
-4
Gamma function
ex
k)
A64
APP. 3
Auxiliary Material
and we may use this relation
rea)
(27)
rea + k + I)
= -------a(a + I) ... (a + k)
(a oF 0, -1, -2,· .. )
for defining the gamma function for negative a (oF -1, -2, " .), choosing for k the
smallest integer such that a + k + I > O. Together with (24), this then gives a definition
of rca) for all a not equal to zero or a negative integer (Fig. 552).
It can be shown that the gamma function may also be represented as the limit of a
product namely. by the formula
(28)
rea)
n! nl>
=
lim
n~!XJ
(
a a
+ I )(a + 2) ...
(a
(a oF 0, - 1, .. ').
+ 11)
From (27) or (28) we see that, for complex a, the gamma function r( a) is a meromurphic
function with simple poles at a = 0, - 1, - 2, ....
An approximation of the gamma function for large positive a is given by the Stirling
formula
(29)
where e is the base of the natural logarithm. We finally mention the special value
(30)
Incomplete gamma functions
Q(a. x)
(31)
= f=e-tt
U
-
1
dt
(a> 0)
x
(32)
rca)
=
pea, x)
+
Q(a. x)
Beta function
(33)
B(x. y) =
I
1
1 (] -
tX-
t)y-l
(x> 0, y > 0)
dt
o
Representation in terms of gan1ma functions:
(34)
B(x. y)
=
f(x)f()')
rex. + .y)
Error function (Fig. 553 and Table A4 in App. 5)
(35)
erf x = -2-
v:;;:
IXe-
t2
dt
0
7
(36)
x
3!7
+ _ ... )
SEC. A3.1
A65
Formulas for Special Functions
erfx
1
0.5
-2
x
/
~
/0.5
-1
Error function
Fig. 553.
erf (x)
1, C017lple171CllTal}, error jill1ction
=
erfc x = I - erf x
(37)
=
~ 2I
V'Tr
I""e -
t
2
dt
x
Fresnel integrals! (Fig. 554)
x
C(x) = {cos (t
o
(38)
C(x)
=
-v:;;;s, S(X)
= vi'Tr/S,
2
)
Set)
dt,
=
Jo sin (t2) dt
co171plemelllary fimctiollS
!¥ -
c(x)
=
s(x)
=
C(x)
(39)
LXcos (t
=
r; - to
S(x) =
\
S
2
)
dt
sin (12) dt
x
Sine integral (Fig. 555 and Table A4 in App. 5)
Sitx)
(40)
=
x
J
o
sin t
-~
t
dt
y
1
C(x)
/
-' '- _/_' ~----4'~
~
'-_/
/
o
-
~.//--
x
Fig. 554,
Fresnel integrals
lAUGUSTIN FRESNEL (1788-1827), French physicist and mathematician. For tables ~ee Ref. [GRI].
A66
APP. 3
Auxiliary Material
Si(X~l
,,/2
-
1
O~'--~I---L--~--L-~5~-L--~--~~L-~1~~~~x
Fig. 555.
Si(:x:)
=
Sine integral
7T/2. ("(}17lplemento ry jimction
(41)
si(x)
=
7T
-
f
Si(x) =
-
2
x
.
SIll
t
- - dt
t
x
Cosine integral (Table A4 in App. 5)
(42)
citx}
f
=
cc
cos t
--
dt
(x> 0)
dt
(x> 0)
t
x
Exponential integral
(43)
f
Ei(x) =
x
-t
~
t
x
Logarithmic integral
(44)
A3.2
lie\")
=
x
Io
-
dt
In t
Partial Derivatives
For differentiation formulas, see inside of front cover.
Let ~ = f(x, y) be a real function of two independent real variables, x and y. If we keep
y constant. say, y = )'1' and think of x as a variable, then f(x, )'1) depends on x alone. If
the derivative of f(x, Yl) with respect to x for a value x = Xl exists. then the value of this
derivative is called the partial derivative of f(x. \") u'ith respect to x at tbe point (Xl' .'"1)
and is denoted by
or by
Other notations are
and
these may be used when subscripts are not used for another purpose and there is no danger
of confusion.
SEC. A3.2
A67
Partial Derivatives
We thus have, by the definition of the derivative,
(1)
The partial delivative of.:
x constant, say, equal to
Xl,
= f(x. y) with respect to y is defined similarly; we now keep
and differentiate f(XI, y) with respect to y. Thus
(2)
Other notations are fy(x}. YI) and ~y(XI' )'1)'
It is clear that the values of those two partial derivatives will in general depend on the
point (Xl, YI)' Hence the partial delivatives a~/ax and iJz/iJ.v at a variable point (x, y) are
functions of x and y. The function az/iJx is obtained as in ordinary calculus by
differentiating z = f(x, y) with respect to x. treating y as a constant, and Bz/By is obtained
by differentiating z with respect to y, treating x as a constant.
E X AMP L E 1
Let:::
2
= I(x. y) = x y
+ x sin y. Then
iiI
~
ilx
= 2n
+
~in'·.
.
ilj
2
= x
ily
~
+ \. COS
Y.
.
•
The partial derivatives iJ:;:/i)x and a~/ay of a function;: = f(x, y) have a very simple
geometric interpretation. The function ;:: = f(x, y) can be represented by a surface in
space. The equation y = Yl then represents a vertical plane intersecting the surface in a
curve. and the partial derivative a::Ji)x at a point (Xl' Yl) is the slope of the tangent (that
is, tan a where a is the angle shown in Fig. 556) to the curve. Similarly. the partial
derivative iJ;:/ay at (Xl, Yl) is the slope of the tangent to the curve X = Xl on the surface
z = f(x, y) at (Xl' YI)'
Fig. 556.
Geometrical interpretation of first partial derivatives
A68
APP. 3
Auxiliary Material
The partial derivatives azlax and a-::.My are called first partial derivatives or partial
derivatives of first order. By differentiating these derivatives once more, we obtain the
four second partial derivatives (or partial derivatives of second order)2
a2 f
a
ax 2
ax
a2f
a
ax ay
ax
a2f
a
(3)
ayax
a2 f
ay2
(:~ ) = fxx
(::.)
= fyx
f
(a ) = fxy
ay
ax
a
aI'
f
-ay ) =fyy'
(a
It can be shown that if all the derivatives concerned are continuous, then the two mixed
partial derivatives are equal, so that the order of differentiation does not matter (see Ref.
[GR4] in App. 1), that is,
(4)
E X AMP L E 2
For the function in Example I.
fxx = 2y,
f xy
= 2x + cos Y =
f yx,
fyy = -x siny.
•
By differentiating the second partial derivatives again with respect to x and y,
respectively, we obtain the third partial derivatives or partial derivatives of the third
order of f, etc.
If we consider a function f(x, y, z) of three independent varutbles, then we have the
three first partial derivatives fAx, y, z), fy{x, y, z), and fz(x, y, z). Here Ix is obtained by
differentiating f with respect to x, treating both y and z as constallts. Thus, analogous to
(I), we now have
etc. By differentiating
derivatives of f, etc.
E X AMP L E 3
f.p
f y, fz again in this fashion we obtain the second partial
Let f(x, y, z) = x 2 + y2 + Z2 + xy eZ • Then
f"
=
2x
+ )' eZ ,
fy = 2y
+ x eZ ,
fxx = 2,
fxy = f y ,' = e
fyy = 2.
fyz = fzy
=
fz = 2z
+ xy eZ ,
Z
•
z
xe ,
•
2 LAUTIO]'l In the subscript notation the subscripts are written in the order in which we differentiate whereas
in the "iY' notation the order is opposite.
'
SEC. A3.3
A69
Sequences and Series
A3.3
Sequences and Series
See also Chap. 15.
Monotone Real Sequences
We call a real sequence xl, X2,
increasing, that is,
. • • ,Xn , .••
a monotone sequence if it is either monotone
or monotone decreasing, that is,
We call Xl,
for all n.
THE 0 REM 1
PROOF
X2, •••
a bounded sequence if there is a positive constant K such that
IXnl
<K
(f a real sequence is bounded and monOTOne, it converges.
Let Xl> X2' . • • be a bounded monotone increasing sequence. Then its terms are smaller
than some number B and, since Xl ~ Xn for all n. they lie in the interval Xl ~ X." ~ B.
which will be denoted by 1o, We bisect 1o; that is, we subdivide it into two parts of equal
length. If the right half (together with its endpoints) contains terms of the sequence, we
denote it by 11' If it does not contain terms of the sequence, then the left half of 10 (together
with its endpoints) is called 11 , This is the first step.
In the second step we bisect h, select one half by the same rule, and call it 12 , and so
on (see Fig. 557 on p. A 70).
In this way we obtain shorter and shorter intervals 1o, 11 , 12 , • • . with the following
prope11ies. Each 1m contains all In for n > 111. No term of the sequence lies to the right
of 1m , and, since the sequence is monotone increasing. all Xn with Il greater than some
number N lie in 1m; of course, N will depend on 111. in general. The lengths of the 1m
approach zero as 111 approaches infinity. Hence there is precisely one number, call it L,
that lies in all those intervals,3 and we may now easily prove that the sequence is
convergent with the limit L.
In fact, given an E > 0, we choose an I1l such that the length of 1m is less than E. Then
L and all the Xn with n > N(m) lie in 1m' and. therefore, IXn - LI < E for all those n.
This completes the proof for an increasing sequence. For a decreasing sequence the proof
is the same, except for a suitable interchange of "left" and "right" in the construction of
those intervals.
•
3 This statement seems to be obvious, but actually it is not; it may be regarded as an axiom of the real number
system in the following form. Let h. 12 , .•• be closed intervals such that each 1m contains all 1n with 11 > m.
and the lengths of the 1m approach zero as III approaches intinity. Then there is precisely one real number that
is contained in all those intervals. This is the so-called Cantor-Dedekind axiom, named after the German
mathematicians GEORG CANTOR (1845-1918). the creamr of set theory, and RICHARD DEDEKIND
(1831-1916), known for his fundamental work in nwnber theory. For further details see Ref. [GR2] in App. I.
(An interval 1 is said to be closed if its two endpoints are regarded as points belonging to J. It is said to be open
if the endpoints are not regarded as points of I.)
APP. 3
A70
Auxiliary Material
10
---------1
B
I
I~<~--I---:: I I'~"III. :1
Fig. 557.
Proof of Theorem 1
Real Series
THEOREM 2
Leibniz Test for Real Series
Let
Xl' X2, •••
be real and monotone decreasing to zero, that is,
(1)
lim
(b)
Xm
=
O.
1U----+X
Then the series witll terms of altematillg signs
converges, and for the remainder Rn after the nth term we have the estimate
(2)
PROOF
Let
Sn
so that
be the 11th partial sum of the series. Then, because of (1 a),
S2 ~ S3 ~ Sl'
Proceeding in this fashion, we conclude that (Fig. 558)
(3)
which shows that the odd partial sums form a bounded monotone sequence, and so do the
even partial sums. Hence. by Theorem L both sequences converge, say,
lim
n_x
S2n+l
= S,
I
2
Fig. 558.
S2n
= s*.
-X
2
IE
8
lim
n_x
8
C-" =:j
4
8
3
8
1
Proof of the Leibniz test
SEC. A3.4
Grad, Div, Curl, V2 in Curvilinear Coordinates
Now. since
S271+1 -
s -
s*
S2n
= lim
=
X2n+1'
A71
we readily see that Ob) implies
lim
S271+1 -
n~x
= n_x
lim (S271+1
S2n
n~x
-
= n_x
lim '2n+1 =
S2")
Hence s* = s. and the series converges with the sum s.
We prove the estimate (2) for the remainder. Since s" -
o.
s, it follows from (3) that
and also
By subtracting
S2n
and
respectively, we obtain
S2n-1'
In these inequalities, the first expression is equal to X2,,+1' the last is equal to -X2m and
the expressions between the inequality signs are the remainders R2n and R 2n - 1 . Thus the
inequalities may be written
•
and we see that they imply (2). This completes the proof.
A3.4
Grad, Div, Curl, V 2
in Curvilinear Coordinates
To simplify formulas we write Cartesian coordinates -' = Xl' Y = -'2' Z = -'3' We denote
curvilinear coordinates by qb q2, q3' Through each point P there pass three coordinate
surfaces q1 = const, q2 = COllSt. q3 = COllSt. They intersect along coordinate curves. We
assume the three coordinate curves through P to be orthogonal (perpendicular to each
other). We write coordinate transformations as
(I)
Corresponding transformations of grad, div, curl, and V2 can all be written by using
(2)
Next to Cartesian coordinates. most important are cylindrical coordinates
= ;: (Fig. 559a on p. A 72) defined by
q1
= r, q2 =
e.
lJ3
(3)
Xl
=
q1
cos
q2
= r cos
e,
and spherical coordinates tiI
(4)
Xl
=
q1
cos q2 sin
q3
=
r,
= r cos
-'3 =
=
=
e, q3 = 4> (Fig. 559b) defined by4
q2
q1
sin q2 =
e,
X2
e sin 4>,
q1
cos
-'2
q3 = r
=
cos
r
sin
q1
sin
q2
sin
q3
=
r sin
e sin 4>
4>.
4This is the notation used in calculus and in many other books. It is logical since in it. 8 play, the ,arne role
as in polar coordinates. C-\L'TIOM Some books interchange the roles of IJ and <p.
A72
APP. 3
Auxiliary Material
z
z
h~
e. z)
--
-z
e
x
(a)
Y
Y
r
Cylindrical coordinates
Fig. 559.
(b)
Spherical coordinates
Special curvilinear coordinates
In addition to the general formulas for any orthogonal coordinates qh
additional formulas for these important special cases.
Linear Element ds.
q2, Q3,
we shall give
In Cartesian coordinates,
(Sec. 9.5).
For the q-coordinates,
(5)
(5')
(Cylindrical coordinates).
For polar coordinates set d-;,2
=
o.
(5")
(Spherical coordinates).
Gradient. grad f = vf = [f Xl' f X2' f X) (partial derivatives; Sec. 9.7). In the
q-system, with u, v, w denoting unit vectors in the positive directions of the Ql, Q2, Q3
coordinate curves, respectively,
(6)
(6')
(6")
divF
=
V.F
1
ar
r
= Vf = - u + -
!!fad!
=""f
1 a
-:-{rF1 )
r ilr
= -
af
= -u
<0
(7')
ilf
gr ad!
a,.
af
aR
--'--v
(Cylindrical coordinates)
iJz
1
af
1 a!
+ - - -y + --w
rsin <I> i/O
I iJF
aF
iJO
az
+ _ ~ + __3
r
iiI
+-w
,. 0<1>
(Spherical coordinates).
(Cylindrical coordinates)
SEC. A3.4
Grad, Div, Curl,
(7")
,2
A73
in Curvilinear Coordinates
div F =
v· F =
I iJ
2 -;- (,.2 F1 )
r "r
I
iJF2
rsm </J
UV
I
iJ
+ - . - --:;-;- + - . - -.- (sin </J F3 )
(Spherical coordinates).
rsm </J rJ</J
(8')
(Cylindrical coordinates)
(8")
(Spherical coordinates).
Curl (Sec. 9.9):
]
(9)
curlF=,xF=--lz]h2 h 3
hlu
h2 v
h3W
a
a
a
aq1
aq2
aq3
hlFI
h2F2
h3 F 3
For cylindrical coordinates we have in (9) (as in the previous formulas)
and for spherical coordinates we have
hI = h,. = I,
1z2 = h" = qI sin q3 = r sin </J,
(
.. I
,,
oS.
"
1
4
APPENDIX
Additional Proofs
~
..
Section 2.6, page 73
Uniqueness 1
Assuming that the problem consisting of the ODE
PROOF OF THEOREM 1
)''' + p(x)y' + q(x)y = 0
(1)
and the two initial conditions
(3)
has two solutions Yllx) and
difference
on the interval 1 in the theorem, we show that their
Y2(X)
is identically zero on 1: then Yl == )'2 on 1, which implies uniqueness.
Since (I) is homogeneous and linear. y is a solution of that ODE on 1. and since
)"2 satisfy the !>ame initial conditions, y satisfies the conditions
(10)
=
0,
z(x)
=
ylxo)
y' (xo)
VI
and
= O.
We consider the function
y(x)2
+ y' (X)2
and its derivative
z'
=
2yy' + 2)"')"".
From the ODE we have
y"
=
,
-py - q)'.
By substituting this in the expression for z.' we obtain
(11)
Now, since y and
y' are real.
IThis proof was suggested by my colleague. Prof. A. D. Ziebur. In this proof we use formula numbers that
have not yet been used in Sec. 2.6.
A74
APP. 4
A75
Additional Proofs
From this and the definition of :: we obtain the two inequalities
(12)
(a)
2yy'
~ )'2
From (l2b) we have 2)')" ~
obtain
+
= ;::,
y'2
-z. Together, 12)')"]
~
-2qyy'
-2y)"'
(b)
1-2qyy'l
~
~ y2
+
y'2
= z.
z. For the last term in (II) we now
Iq112y/1 ~ Iqlz.
=
Using this result a" well as -p ~ Ipl and applying (I2a) to the term 2yy' in (11), we find
Since / 2 ~
)'2
+ )"2
= .<;,
from this we obtain
z'
~ (l
+ 21pl + Iql)z
or, denoting the function in parentheses by 11,
z'
(l3a)
~ 11::
for all x on 1
Similarly. from (11) and (12) it follows that
(l3b)
~ Z
+ 2Ip!;:: + Iq\::: = 11;::.
The inequalities (I3a) and (13b) are equivalent to the inequalities
(14)
.<;' -
h::
~ 0,
z' +
h.<; ~
o.
Integrating factors for the two expressions on the left are
and
The integrals in the exponents exist because Iz is continuous. Since Fl and F2 are positive.
we thus have from (14)
and
This means that F1z is non increasing and F 2 z is nondecreasing on I. Since ::(xo) = 0 by
(10). when x ~ Xo we thus obtain
and similarly, when
x ~ Xo,
Dividing by Fl and F2 and noting that these functions are positive. we altogether have
z~
· 1·IeS that;:: = y 2+' l'2
ThisImp
"'"
o.
z~o
0 on I. Hence y "'" 0 or YI "'" Y2 on I.
for all x on I.
•
A76
APP. 4
Additional Proofs
Section 5.4, pages 184
PROOF 0 F THE 0 REM 2
Frobenius Method. Basis of Solutions. Three Cases
The formula numbers in this proof are the same as in the text of Sec. 5.4. An additional
formula not appearing in Sec. 5.4 will be called (A) (see below).
The ODE in Theorem 2 is
b(x),
where
c(x)
)''' + -x
-·\' + -x )' = 0•
2
(1)
b(,)
and c(x) are analytic functions. We can write it
x 2 y"
(1')
+ xb(x)),' +
dx)),
+
Co
= O.
The indicial equation of (l) is
(4)
r(r - 1)
bor
+
= O.
The roots 1"1' 1"2 of this quadratic equation determine the general form of a basis of solutions
of 0). and there are three possible cases as follows.
Case 1. Distinct Roots not Differing by an Integer.
form
A first solution of 0) is of the
(5)
and can be determined as in the power series method. For a proof that in this case, the
ODE (1) has a second independent solution of the form
(6)
see Ref. [All] listed in App. 1.
Case 2. Double Root. The indicial equation (4) has a double root r if and only if
(b o - 1)2 - 4co = 0, and then r = !( I - b o). A first solution
(7)
r
=
!(l -
boY,
can be determined as in Case I. We show that a second independent solution is of the
form
(x> 0).
(8)
We use the method of reduction of order (see Sec. 2.1), that is, we determine ll(X) such
that )'2(X) = U(X)Yl(X) is a solution of (I). By inserting this and the derivatives
=
""
)'2
II )'1
+
I
,
2u Yl
+ UYIII
into the ODE (I') we obtain
x
2("
U)'1
+')"
_U .'"1 +
" + xb(ll ,)'1 + UYI)
, +
UYI)
cUYI
= O.
APP. 4
A77
Additional Proofs
Since )'1 is a solution of (I r), the sum of the terms involving u is zero, and this equation
reduces to
By dividing by
2
X )'1
and inserting the power series for b we obtain
u
"
(
+ 2 -)'1, + -b o + . ..
x
Yt
)
u
,
o.
=
Here and in the following the dots designate terms that are constant or involve positive
powers of x. Now from (7) if follows that
y~
+ l)alx + ... ]
x [aO+alX +···]
xT - 1 ll"ao
+
(I"
T
Yt
I"
x
+
Hence the previous equation can be written
(A)
U rr
+
(21": bo
+ ... )
o.
Ur =
Since r = (l - b o)/2, the term (21" + bo)/x equals I/x, and by dividing by u' we thus
have
u"
u
,
x
+
By integration we obtain In u' = -In x + ... , hence u' = (Ilx)e<·· .J. Expanding the
exponential function in powers of x and integrating once more, we see that u is of the
form
Inserting this into)'2
=
UY1,
we obtain for )'2 a representation of the form (8).
Case 3. Roots Differing by an Integer.
positive integer. A first solution
We write 1"1 =
I"
and 1"2 =
I" -
P where p is a
(9)
can be determined as in Cases 1 and 2. We show that a second independent solution is
of the form
(10)
where we may have k -=I=- 0 or k = O. As in Case 2 we set)'2
literally as in Case 2 and give Eq. (A),
u"
+
(21": b
o
+ .. -)
u'
= O.
=
1t)'1.
The first steps are
APP.4
A7B
Additional Proofs
Now by elementary algebra. the coefficient b o - I of r in (4) equals minus the sum of
the roots,
=
bo - I
Hence 2r
+ r2) = -(r + r - p) = -2r + p.
-(r1
+ bo = p + L and division by u' gives
The further ')teps are as in Case 2. Integrating, we find
In u'
= -(p
+ 1) In x + ... ,
thus
I
u = x
-(p+ll ( ... J
e
where dots stand for some series of nonnegative integer powers of x. By expanding the
exponential function as before we obtain a series of the form
I
U
We integrate once more. Writing the resulting logarithmic term first, we get
u = k Inx
p
+ (- _1_P - ... - kp- 1 + kp+IX + ... )
Hence. by (9) we get for Y2 =
px
UYI
x
the formula
But this is of the form (10) with k = kp since rl - P = r2 and the product of the two
•
series involves nonnegative integer powers of x only.
Section 5.7, page 205
THEOREM
Reality of Eigenvalues
If p, q, r, alld p'
ill the Sturm-Liouville equation (I) of Sec. 5.7 are real-valued and
continuous on the interval a ~ x ~ band rex) > 0 throughout that interval (or
rex) < 0 throughout that interval). then all the eigenvalues of the Stunl1-Liouville
problem (1). (2). Sec. 5.7. are real.
PROOF
Let A =
0'
+ i{3 be an eigenvalue of the problem and let
y(x) = u(x)
be a corresponding eigenfunction: here
Sec. 5.7, we have
(
pu I
0',
+ iu(x)
{3. u. and u are real. Substituting this into (1),
+.lpU ')' + (q +
O'r
+ i{3r)(u + iu) = o.
APP. 4
Additional Proofs
A79
This complex equation is equivalent to the following pair of equations for the real and
the imaginary parts:
(pu')'
+
(q
+
ar)u - {3rv
=
0
(pU')'
+
(q
+
ar)u
+ 13m
=
O.
Multiplying the first equation by u, the second by
-(3(1I 2
+
u 2 )r
-ll
and adding, we get
= u(pu')' - u(pu')'
= [(pu')lI - (pu')uJ'.
The expression in brackets is continuous on a ~ x ~ b. for reasons similar to those in
the proof of Theorem I, Sec. 5.7. Integrating over x from a to b. we thus obtain
Because of the boundary conditions the right side is zero; this is as in that proof. Since y
is an eigenfunction, u 2 + u2
O. Since y and r are continuous and r > 0 (or r < 0) on
the interval a ~ x ~ b, the integral on the left is not zero. Hence, (3 = 0, which means
that A = a is reaL This completes the proof.
•
*'
Section 7.7, page 308
THEOREM
Determinants
The definition of a detennillant
(7)
D
= detA =
as given in Sec. 7.7 is unambiguous, that is, it yields the same value of D no matter
which rows or columns we choose in developings.
PROOF
In this proof we shall use fonnula numbers not yet used in Sec. 7.7.
We shall prove first that the same value is obtained no matter which row is chosen.
The proof is by induction. The statement is true for a second-order determinant. for
which the developments by the first row aU{/22 + a 1 2( -(21) and by the second row
a21(-a12) + {/22 a ll give the same value alla 22 - a12a21' Assuming the statement to be
true for an (n - l)st-order determinant, we prove that it is true for an nth-order
determinant.
ABO
APP. 4
Additional Proofs
For this purpose we expand D in terms of each of two arbitrary rows, say, the ith and
the jth, and compare the results. Without loss of generality let us assume i < j.
First Expansioll.
We expand D by the ith row. A typical term in this expansion
i~
The minor Mik of aik in D is an (11 - 1)st-order determinant. By the induction hypothesis
we may expand it by any row. We expand it by the row corresponding to the jth row of
D. This row contains the entries ajl (/ =1= k). It is the (j - I )st row of M ik • because Mik
does not contain entries of the ith row of D. and i < j. We have to distinguish between
two ca<;es as follows.
Case I. If I < k, then the entry ajl belongs to the lth column of Mik (see Fig. 560). Hence
the term involving ajl in this expansion is
(20)
where M ikj / is the minor of (ljl in M ik . Since this minor is obtained from Mik by deleting
the row and column of ajl, it is obtained from D by deleting the ith and jth rows and the
kth and lth columns of D. We insert the expansions of the Mik into that of D. Then it
follows from (19) and (20) that the terms of the resulting representation of D are of the
form
(2Ia)
(l
<
k)
where
b=i+k+j+l-l.
11. If I > k, the only difference is that then ajl belongs to the (l - I )st column of
because Mik does not contain entries of the kth column of D, and k < I. This causes
an additional minus sign in (20). and. instead of (21 a). we therefore obtain
Case
M ik ,
(2Ib)
(l
where b is the same as before.
lth
kth
kth
lth
col.
col.
col.
col.
I
I
I
I
--~------@---I
ith row
I
--6}-----~---I
I
I
I
I
I
Case I
Fig. 560.
-
-&-----~--I
jth row
I
---+------@--I
I
I
I
I
I
Case II
Cases I and II of the two expansions of D
> k)
APP. 4
A81
Additional Proofs
Second Expansion.
expansion is
We now expand D at first by the jth row. A typical term in this
(22)
By the induction hypothesis we may expand the minor Mjl of ajl in D by its ith row, which
corresponds to the ith row of D. since j > i.
> I, the entry 0ik in that row belongs to the (k - I )st column of Mjl' because
does not contain entIies of the Ith column of D. and I < k (see Fig. 560). Hence the
term involving aik in this expansion is
Case T. If k
Mjl
(23)
aik·
· Mjl)
(cof actor 0 f aik In
= llik·
(-
l)i+(k- D M ikjZ,
where the minor M ikjl of aik in Mjl is obtained by deleting the ith and jth rows and the
kth and Ith columns of D [and is, therefore, identical with M ikj /, in (20), so that our notation
is consistentJ. We insert the expansions of the Mjl into that of D. It follows from (22) and
(23) that this yields a representation whose terms are identical with those given by (21 a)
when I < k.
< I, then 0ik belongs to the kth column of Mjl' we obtain an additional minus
sign, and the result agrees with that characterized by (21 b).
Case 1I. If k
We have shown that the two expansions of D consist of the same terms, and this proves
our statement concerning rows.
The proof of the statement concerning colu11l1ls is quite similar; if we expand D in
terms of two arbitrary columns, say, the kth and the !th. we find that the general term
involving 0jlaik is exactly the same as before. This proves that not only all column
expansions of D yield the same value, but also that their common value is equal to the
common value of the row expansion), of D.
This completes the proof and shows that ollr definitioll of all mil-order detel7ninalll is
unambiguolls.
•
Section 9.3, page 377
PROOF OF FORMULA (2)
We prove that in right-handed
Carte~ian
coordinates. the vector product
has the components
(2)
We need only consider the case v =1= O. Since v is perpendicular to both a and b, Theorem
1 in Sec. 9.2 gives a • v = 0 and b • v = 0; in components [see (2), Sec. 9.2],
(3)
A82
APP. 4
Additional Proofs
Multiplying the first equation by b3 , the last by
a3.
and subtracting, we obtain
Multiplying the first equation by b l , the last by
ill'
and subtracting, we obtain
We can
ea~ily
verify that these two equations are
~atisfied
by
(4)
where c is a constant. The reader may verify by inserting that (4) also satisfies (3). Now
each of the equations in (3) represents a plane thruugh the origin in VIV2v3-space. The
vectors a and b are normal vectors of these planes (see Example 6 in Sec. 9.2). Since
v =1= 0, these vectors are not parallel and the two planes do not coincide. Hence their
intersection is a straight line L through the origin. Since (4) is a solution of (3) and, for
varying c, represents a straight line, we conclude that (4) represents L, and every solution
of (3) must be of the form (4). Tn particular, the components of v must be of this form,
where c is to be determined. From (4) we obtain
This can be written
as can be verified by performing the indicated multiplications in both formulas and
comparing. Using (2) in Sec. 9.2, we thus have
By comparing this with formula (12) in Team Project 24 of Problem Set 9.3 we conclude
that c = ±1.
We show that c = + 1. This can be done a" follows.
If we change the lengths and directions of a and b continuously and so that at the end
a = i and b = j (Fig. l86a in Sec. 9.3), then v will change its length and direction
continuously, and at the end, v = i X j = k. Obviously we may effect the change so that
both a and b remain different from the zero vector and are not parallel at any instant.
Then v is never equal to the zero vector, and since the change is continuous and c can
only assume the values + 1 or -I, it follows that at the end c must have the same value
as before. Now at the end a = i, b = j. v = k and, therefore, al = 1. b2 = I, V3 = L
and the other components in (4) are zero. Hence from (4) we see that V3 = c = + 1. This
proves Theorem I.
For a left -handed coordinate system, i X j = -k (see Fig. 186b in Sec. 9.3), resulting
in c = -1. This proves the statement right after formula (2).
•
APP. 4
A83
Additional Proofs
Section 9.9, page 416
PROOF OF THE INVARIANCE OF THE CURL
This proof will follow from two theorems (A and B), which we prove first.
THEOREM A
Transformation Law for Vector Components
For any vector v the componenTs VI, V2 , V3 and Vl*' V2*' V3* in allY two systems
of Cartesian coordinates Xl> x3, X3 and Xl*' X2*' x3*' respectively, are related by
0)
and conversely
(2)
with coefficients
C 13
= i*·k
(3)
C31
= k*·i
C32
=
c 33 = k*·k
k*· j
satisfying
3
(4)
L
CkjCmj =
i5km
(k,1I/
= 1, 2, 3),
j~1
where the Kronecker delta2 is given by
(k
* m)
(k
= 11/)
and i, j, k and i*, j*, k* denote the unit vectors ill the positive
X2*-' x3*-directions, respectively.
Xl-,
X2-, X3- and
Xl*-'
2LEOPOLD KRONECKER (l823-18YI), German mathematician at Berlin, who made important
contributions to algebra. group theory. and number theory.
We shall keep our discussion completely independent of Chap. 7, but readers familiar with matrices should
rccognize that we are dealing with orthogonal transformations and matrices and that our present theorem
follows from Theorem 2 in Sec. 8.3.
APP. 4
A84
PROOF
Additional Proofs
The representation of v in the two systems are
(5)
Since i* • i* = L i* • j*
from this and (Sa)
= 0, i* • k* = 0, we get from (5b) simply i* • v =
Vl*
and
Because of (3), this is the first formula in (1), and the other two fommlas are obtained
similarly. by considering j* • v and then k* • v. Formula (2) follows by the same idea.
taking i • v = VI from (Sa) and then from (5b) and (3)
and similarly for the other two components.
We prove (4). We can write (1) and (2) briefly as
3
(6)
(a)
Vj
3
= ~ cl1lJ v m *,
(b)
Vk
*=
Vj
into
Vk *,
CkjVj.
j=1
m=1
Substituting
~
we get
3 3 3
Vk*
=~
Ckj
j=1
~
CmjV n
.* = ~
m=1
Vm *
m=l
where k = 1, 2, 3. Taking k = 1, we have
For this to hold for eve1), vector v, the first sum must be I and the other two sums O. This
proves (4) with k = 1 for m = 1, 2, 3. Taking k = 2 and then k = 3. we obtain (4) with
•
k = 2 and 3, form = 1,2,3.
THEOREM B
Transformation Law for Cartesian Coordinates
The trclllc~f01111{{tioll of allY Cartesian XIX2x3-coordinate system into any other
Cartesian XI *X2 *X3 *-coordillate system is of the fonn
3
(7)
Xm*
= ~ CnljXj + b m ,
111
= 1.2.3,
j=1
with coefficients (3) and COllstants bI> b 2, b3; cOllversely,
3
(8)
Xk
=
:L
1£=1
CnkXn
* + bk ,
k
=
1.2,3.
APP. 4
A8S
Additional Proofs
Theorem B follows from Theorem A by noting that the most general transformation of a
Cartesian coordinate system into another such system may be decomposed into a
tranSf0I111ation of the type just considered and a translation; and under a translation,
cOlTesponding coordinates differ merely by a constant.
PROOF OF THE INVARIANCE OF THE CURL
We write again Xl, X2, X3 instead of x, y,~, and similarly Xl*'
X2*' X3i< for other Cartesian
coordinates. assuming that both systems are right-handed. Let al. a 2 • 03 denote the
components of curl v in the xIx2x3-coordinates. as given by (l), Sec. 9.9. with
v=X2,
.
Similarly, let al*' a2*' a 3* denote the components of curl v in the xl*x2*x3*-coordinate
system. We prove that the length and direction of curl v are independent of the particular
choice of Cartesian coordinates. as asserted. We do this by showing that the components
of curl v satisfy the transformation law (2), which is characteristic of vector components.
We consider {[I' We use (6a), and then the chain rule for functions of several variables
(Sec. 9.6). This gives
3
=
3
L L
(
aV.,/
c m3 (Jx/'
m=I j=l
From this and (7) we obtain
Note what we did. The double sum had 3 X 3 = 9 terms, 3 of which were zero (when
= j). and the remaining 6 terms we combined in pairs as we needed them in getting
a I *, a2*' {/3*
111
We now use (3), Lagrange's identity (see Team Project 24 in Problem Set 93) and
k* x j* = -i* and k X j = -i. Then
= (k* x j*) • (k x j) = i* • i = Cn.
etc.
A86
APP. 4
Additional Proofs
Hence a] = clla]* + c2]a2* + c3]a3*' This is of the form of the first formula in (2) in
Theorem A, and the other two formulas of the form (2) are obtained similarly. This proves
the theorem for right-handed systems. If the .\"lx2'\·3-coordinates are left-handed, then
k X j = +i, but then there is a minus sign in front of the determinant in (1), Sec. 9.9. •
Section 10.2, pages 426-427
PROOF 0 F THE 0 REM 1, PA R T (b)
f
(1)
We prove that if
f
F(r) • dr =
(Fl dx
c
C
+
F2 dy
+
F3 dz)
with continuous F I , F 2, F3 in a domain D is independent of path in D, then F = grad f
in D for some f; in components
(2' )
We choose any fixed A: (xo, Yo, zo) in D and any B: (x, )" z) in D and define f by
(3)
f(x, y, z)
=
fo
+
I
B
(F] dx*
+
F2 dy*
+
F3 dz*)
A
with any constant fo and any path from A to BinD. Since A is fixed and we have
independence of path. the integral depends only on the coordinates x. y. z. so that (3)
defines a function f(x, y. z) in D. We show that F = grad f with this f, beginning with
the first of the three relations (2'). Because of independence of path we may integrate
from A to B]: (x], y, z) and then parallel to the x-axis along the segment B]B in Fig. 561
with B] chosen so that the whole segment lies in D. Then
f(x, y, z) = fo
+
I
Bl
(FI d.x:*
+
F2 dy*
A
+
F3 dz*)
+
f
B
(FI dx*
+
F2 dy*
+
F3 dz*).
~
We now take the partial derivative with respect to x on both sides. On the left we get
iJf/iJx. We show that on the right we get F]. The derivative of the first integral is zero
because A: (xo, Yo, zo) and B 1 : (x], y, z) do not depend on x. We consider the second
integral. Since on the segment B]B, both y and z are constant, the terms F2 dy* and
z
y
x
Fig. 561.
Proof of Theorem 1
APP.4
A87
Additional Proofs
F3 d::.* do not contribute to the detivative of the integral. The remaining part can be written
as a definite integral.
Hence its partial derivative with respect to x is Fl(X, y, ::.), and the first of the relations
(2') is proved. The other two formulas in (2') follow by the same argument.
•
Section 13.4, page 620
Cauch~-Riemann Equations
We prove that Cauchy-Riemann equations
PROOF 0 F THE 0 REM 1
(1)
are sufficient for a complex function fez) = u(x, y) + iv(x, y) to be analytic; precisely. (f
the real parI u and the inzaginaJ~v part v of f(z.) satisfy (I) ill a domain D ill the complex
plane and if the ponied derivatives i11 (I) are COlltillllOUS in D, then fez) is analytic in D.
In this proof we write D.::. = .lx
is as follows.
+ iD.y and D.f = fez. + D.z.) - f(z.).
The idea of proof
(a) We express .If in terms of first partial derivatives of II and v. by applying the mean
value theorem of Sec. 9.6.
(b) We get rid of partial derivatives with respect to y by applying the Cauchy-Riemann
equations.
(c) We let .l.:: approach zero and show that then D.fl.l::. as obtained approaches a limit
which is equal to U x + iv x , the right side of (4) in Sec. 13.4. regardless of the way of
approach to zero.
(a) Let P: (x, y) be any fixed point in D. Since D is a domain, it contains a neighborhood
of P. We can choose a point Q: (x + D.x, y + D.)') in this neighborhood such that the
straight-line segment PQ is in D. Because of our continuity a~sumptions we may apply
the mean value theorem in Sec. 9.6. This yields
u(x
+
v(x
+ D.x, y + .ly) - vex, y)
D.x, y
+
.ly) -
lI(X,
y)
= (.lx)ux(M 1 ) + (D.y)uyCM 1 )
=
(D.x)v x (M 2)
+ (D.y)vyCM2)
where Ml and M2 (01= Ml in general!) are suitable points on that segment. The first line
is Re D.f and the second is 1m .If, so that
(b)
uy
=
-vx
and
Vy
=
IIx
by the Cauchy-Riemann equations, so that
A88
APP. 4
Additional proofs
Also 11::. =
11.\' = 0::. -
~x
+ il1.\', so that we can write I1x = ~::. - il1."'" in
= -i(.1.:: - .1x) in the second term. This gives
the first term and
~x)li
By performing the multiplications and reordering we obtain
111 =
(.1::')lI x (M l ) - iI1Y{lIx (Ml) - lI x (M2 )}
+
i[(~::.Wr(Ml) -
..lx{ux(Ml ) - u x(M2 )}]·
Division by 11::. now yields
i:lx
i.1y
.1~
.
{lI x (Ml) -
~-
tl x(M2 )} -
-
{ux(M l ) - u x(M2 )}·
(e) We finally let 11::. approach zero and note that 111-,,111::.1 ~ I and Il1xll1zl ~ I in (A).
Then Q: (x + ~x, y + ~y) approaches P: (x, y), so that Ml and M2 must approach P.
Also, since the partial derivatives in (A) are assumed to be continuous, they approach
their value at P. In particular, the differences in the braces {... } in (A) approach zero.
Hence the limit of the right side of (A) exists and is independent of the path along which
11::. ---7 O. We see that this limit equals the right side of (4) in Sec. 13.4. This means that
1(::.) is analytic at every point.:: in D, and the proof is complete.
•
Section 14.2, pages 647-648
Goursat proved Cauchy's
integml theorem without assuming that f' (.::) is continuous, as follows.
We start with the case when C is the boundary of a triangle. We orient C
counterclockwise. By joining the midpoints of the sides we subdivide the triangle into
four congruent triangles (Fig. 562). Let CI . C n . C m . CIV denote their boundaries. We
claim that (see Fig. 562).
GOURSAT'S PROOF OF CAUCHY'S INTEGRAL THEOREM
(I)
fI
G
d.::
=
f
G,
I
d.::
+
f
Gn
I
dz
+
f
GIll
I
d.::
+
f
I
d::..
G,v
Indeed, on the right we integrate along each of the three segments of subdivision in both
possible directions (Fig. 562), so that the corresponding integrals cancel out in pairs, and
the sum of the integrals on the right equals the integral on the left. We now pick an integral
on the right that is biggest in absolute value and call its path Cl' Then. by the triangle
inequality (Sec. 13.2),
Fig. 562.
Proof of Cauchy's integral theorem
APP. 4
A89
Additional proofs
We now subdivide the triangle bounded by C 1 as before and select a triangle of
subdivil>ion with boundary C2 for which
Then
Continuing in this fashion, we obtain a sequence of triangles T 1 , T2 , ••• with boundaries
Cl> C2 , . • • that are similar and such that Tn lies in Tm when 11 > Ill, and
n = 1,2, .. '.
(2)
Let ':::0 be the point that belongs to all these triangles. Since
the derivative J' (':::0) exists. Let
(3)
h(.:::)
=
fez) - f(zo)
z -:'::0
f is differentiable at :.:: =
:'::0,
J' (zo)·
-
Solving this algebraically for f(:.::) we have
fez) = f(:.::o)
+ (z
- zo)J' (:'::0)
+
11(:::)(:.:: -
':::0)·
Integrating this over the boundary Cn of the triangle Tn gives
Since f(.:::o) and J' (zo) are constants and Cn is a dosed path. the first two integrals on the
right are zero, as follows from Cauchy's proof, which is applicable because the integrands
do have continuous derivatives (0 and const, respectively). We thus have
f
fez) d:::
en
=
f
h(z)(z -
:'::0)
d:.::.
en
Since J' (:'::0) is the limit of the difference quotient in (3), for given
8 > 0 such that
(4)
Ih(z)1 < e
when
Iz -
E
> 0 we can find a
:'::01 < 8.
We may now take 11 so large that the tliangle Tn lies in the disk Iz - :::01 < 8. Let Ln be
the length of Cn' Then I::: - zol < Ln for all :: on Cn and::oin Tn. From this and (4) we
have 111(.:::)(z - :(0)1 < eLn . The ML-inequality in Sec. 14.1 now gives
(5)
Now denote the length of C by L. Then the path C1 has the length Ll = Ll2, the path C2
has the length ~ = Ll/'2 = Ll4, etc., and Cn has the length L" = Ll2n. Hence
Ln 2 = L2/4n. From (2) and (5) we thus obtain
A90
APP. 4
Additional Proofs
By choosing E (> 0) sufficiently small we can make the expression on the right as small
as we please, while the expression on the left is the definite value of an integral.
Consequently. this value must be zero, and the proof is complete.
The proof for the case ill which C is the boundary of a polygon follows from the
previous proof by subdividing the polygon into triangles (Fig. 563). The integral
corresponding to each such triangle is zero. The sum of these integrals is equal to the
integral over C, because we integrate along each segment of subdivision in both
directions, the corresponding integrals cancel out in pairs, and we are left with the integral
over C.
The case of a general simple closed path C can be reduced to the preceding one by
inscribing in C a closed polygon P of chords, which approximates C "sufficiently
accurately," and it can be shown that there is a polygon P such that the integral over P
differs from that over C by less than any preassigned positive real number E, no matter
how small. The details of this proof are somewhat involved and can be found in Ref. [D6]
listed in App. 1.
•
Fig. 563.
Proof of Cauchy's integral theorem for a polygon
Section 15.1, page 667
PROOF 0 F THE 0 REM 4
Cauchy's Convergence Principle for Series
(a) [n this proof we need two concepts and a theorem, which we list first.
1. A bounded sequence SI' S2, • • . is a sequence whose terms all lie in a disk of
(sufficiently large, finite) radius K with center at the origin; thus ISnl < K for alln.
2. A limit point a of a sequence Sb S2, . • • is a point such that, given an E> 0, there
are infinitely many terms satisfying ISn - al < E. (Note that this does not imply
convergence, since there may still be infinitely many tenns that do not lie within that
circle of radius E and center a.)
Example: ~. ~, ~. ~, 1~' ~~ • . . . has the limit points 0 and 1 and diverges.
3. A bounded sequence in the complex plane has at least one limit point.
(Bolzano-Weierstrass theorem: proof below. Recall that "sequence" always mean infinite
sequence.)
(b) We now turn to the actual proof that
every E > 0 we can find an N such that
(1)
Izn+l
ZI
+ ... + zn+pl <
E
+
~2
+ ...
converges if and only if for
for every n > Nandp
Here, by the definition of partial sums,
Sn+p -
Sn
=
2n+l
+ ... + zn+p'
= 1,2, . ".
APP. 4
A91
Additional proofs
Writing
II
+P=
r, we see from this that (1) is equivalent to
for all r > Nand n
(1*)
>
N.
Suppose that SI' S2, ... converges. Denote its limit by s. Then for a given E> 0 we can
find an N such that
for every n
Hence, if r > Nand n
>
>
N.
N, then by the triangle inequality (Sec. 13.2),
that is, (I *) holds.
(c) Conversely, assume that SI, S2, • . . satisfies (1 *). We first prove that then the
sequence must be bounded. Indeed, choose a fixed E and a fixed n = no > N in (1 *).
Then (1 *) implies that all s,. with r > N lie in the disk of radius E and center s"o and only
fillitely many tel7l1S SI, ... , SN may not lie in this disk. Clearly, we can now find a circle
so large that this disk and these finitely many terms all lie within this new circle. Hence
the sequence is bounded. By the Bolzano-Weierstrass theorem, it has at least one limit
point, call it s.
We now show that the sequence is convergent with the limit s. Let E > 0 be given.
Then there is an N* such that 1ST - snl < E/2 for all r > N* and 11 > N*, by (1 *). Also,
by the definition of a limit point, ISn - sl < E/2 for infinitely lI1allY n, so that we can find
and fix an Il > N* such that ISn - sl < El2. Together, for e\'el)' r > N*,
ST -
1
sl =
I(ST - S.,}
+ (Sn -
s)1
~
Is,. -
snl
+ ISn -
sl
E
E
< 2 + 2 =
E;
that is. the sequence SI, S2' ... is convergent with the limit s.
THEOREM
•
Bolzano-Weierstrass Theorem 3
A bounded infillite sequellce Z1> Z2, 23, . . • in the complex plane has at least one
limit point.
PROOF
It is obvious that we need both conditions: a finite sequence cannot have a limit point.
and the sequence I, 2, 3.... , which is infinite but not bounded. has no limit point. To
prove the theorem, consider a bounded infinite sequence ZI. Z2 • ... and let K be such that
< K for all n. If only finitely many values of the Zn are different, then. since the
sequence is infinite, some number z must occur infinitely many times in the sequence,
and, by definition, this number is a limit point of the sequence.
We may now tum to the case when the sequence contains infinitely many differem
terms. We draw a large square Qo that contains all Zw We subdivide Qo into four congruent
squares, which we number 1, 2. 3, 4. Clearly, at least one of these squares (each taken
Iznl
3BERNARD BOLZANO (1781-1848). Austrian mathematician and professor of religious studies, was a
pioneer in the study of point sets, the foundation of analysis, and mathematical logic.
For Weierstrass. see Sec. 15.5.
A92
APP. 4
Additional Proofs
with its complete boundary) must contain infinitely many terms of the sequence. The
square of this type with the lowest number (1. 2, 3, or 4) will be denoted by Q1' This is
the first step. In the next step we subdivide Q1 into four congruent squares and select a
square Q2 by the same rule, and so on. This yields an infinite sequence of squares Qo.
Q1, Q2, ... , Qn, ... with the property that the side of Qn approaches zero as 11 approaches
infinity, and Qm contains all Qn with 11 > m. It is not difficult to see that the number
which belongs to all these squares,4 call it:::: = a, is a limit point of the sequence. In fact,
given an E > O. we can choose an N so large that the side of the square QN is less than
€ and, since QN contains infinitely many Zn. we have Izn - aJ < E for infinitely many 11.
This completes the proof.
•
Section 15.3, pages 681-682
T (b) OF THE PROOF OF THEOREM 5
We have to show that
=
L
an LlZ[(Z
+
+
LlZ)",-2
2z(z
+
uz)n-3
+ ... +
(11 -
1) zn-2],
11.=2
thus,
(z
+
LlZ)n - zn
Llz
= LlZ[(Z + Llz)n-2 +
If we set
Z
+ .1z =
band z
2z(z
+
LlZ)n-3
= a, thuf, ,lz =
+ ... +
(n -
l)z11.-2].
h - a, this becomes simply
(7a)
(11
=
2,3, ... ),
where An is the expression in the brackets on the right,
(7b)
thus, A2 = 1, A3 = b
since then
+
2a. etc. We prove (7) by induction. When n
- 2a
=
(b
+
a)(b - a)
b-a
- 2a
= 2. then
(7) holds.
= b - a = (b - a)A 2 .
Assuming that (7) holds for 11 = k, we show that it holds for n = k + 1. By adding and
subtracting a term in the numerator and then dividing we first obtain
bk + 1
b-a
-
ba k
+
bak
-
ak + 1
b-a
4The fact that such a unique number;:; = a exists seems [0 be obvious, but it actually follows from an axiom
of the real number system, the so-called CantOl~Dedekind axiom: see footnote 3 in App. A3.3.
APP. 4
Additional Proofs
A93
By the induction hypothesis, the right side equals b[(b - a)Ak
calculation shows that this is equal to
From (7b) with
Il
+
ka k - 1 ]
+
a k . Direct
= k we see that the expression in the braces {... } equals
bk - 1
+
2ab k - 2
+ ... +
bk + 1
ak + 1
(k - l)ba k - 2
+
ka k - 1 = A k + 1 •
+
(k
+
Hence our result is
_
b-a
= (b - a)A k + 1
Taking the last term to the left, we obtain (7) with n
integer n ~ 2 and completes the proof
=
k
+
l)a k .
I. This proves (7) for any
•
Section 18.2, page 754
without the use of a harmonic conjugate
A NOT HER PROOF 0 F THE 0 REM 1
We show that if w = u + iu = i(z) is analytic and maps a domain D conformally onto
a domain D* and <I>*(u, u) is harmonic in D*, then
(1)
<I>(x, y) = <I>*(u(x, y), u(x, y))
is harmonic in D. that is, y2<1> = 0 in D. We make no use of a hmmonic conjugate of
<1>*, but use straightforward differentiation. By the chain rule,
We apply the chain rule again. underscoring the terms that will drop out when we form
,2<1>:
<l>yy is the same with each x replaced by y. We form the sum y2<1>. In it, <I>~u
=
<I>~v is
multiplied by
which is 0 by the Cauchy-Riemann equations. Also y 2 U = 0 and y 2 U = O. There remains
By the Cauchy-Riemann equations this becomes
and is 0 since <1>* is harmonic.
•
APPENDIX
5
Tables
For Tables of Laplace transforms see Secs. 6.8 and 6.9.
For Tables of Fourier transforms see Sec. 11.10.
If you have a Computer Algebra System (CAS), you may not need the present tables,
but you may still find them convenient from time to time.
Table Al
Bessel Functions
For more extensive tables see Ref. [GRll in App. I.
x
1 0(x)
1 1 (x)
x
1 0(:0:)
0.0
0.1
0.2
0.3
0.4
1.0000
0.9975
0.9900
0.9776
0.9604
0.0000
0.0499
0.0995
0.1483
0.1960
3.0
3.1
3.2
3.3
3.4
-0.2601
-0.2921
-0.3202
-0.3443
-0.3643
0.3391
0.3009
0.2613
0.2207
0.1792
6.0
6.1
6.2
6.3
6.4
0.1506
0.1773
0.2017
0.2238
0.2433
-0.2767
-0.2559
-0.2329
-0.2081
-0.1816
0.5
0.6
0.7
0.8
0.9
0.9385
0.9120
0.8812
0.8-1-63
0.8075
0.2423
0.2867
0.3290
0.3688
0.4059
3.5
3.6
3.7
3.8
3.9
-0.3801
-0.3918
-0.3992
-0.4026
-0.4018
0.1374
0.0955
0.0538
0.0118
-0.0272
6.5
6.6
6.7
6.8
6.9
0.2601
0.2740
0.2851
0.2931
0.2981
-0.1538
-0.1250
-0.0953
-0.0652
-0.0349
1.0
1.1
1.2
1.4
0.7652
0.7196
0.6711
0.6201
0.5669
0...\401
0.4709
OA983
0.5220
0.5419
-l.0
4.1
4.2
4.3
4.4
-0.3971
-0.3887
-0.3766
-0.3610
-0.3423
-0.0660
-0.1033
-0.1386
-0.1719
-0.2028
7.0
7.1
7.2
7.3
7.4
0.3001
0.2991
0.2951
0.2882
0.2786
-0.0047
0.0152
0.0543
0.0826
0.1096
1.5
1.6
1.7
1.8
1.9
0.5118
0.4554
0.3980
0.3400
0.2818
0.5579
0.5699
0.5778
0.5815
0.5812
4.5
-l.6
4.7
4.8
4.9
-0.3205
-0.1961
-0.1693
-0.2404
-0.2097
-0.2311
-0.2566
-0.2791
-0.2985
-0.3147
7.5
7.6
7.7
7.8
7.9
0.2663
0.2516
0.2346
0.2154
0.1944
0.1352
0.1592
0.1813
0.2014
0.2192
2.0
2.1
2.2
2.3
2.4
0.2239
0.1666
0.1104
0.0555
0.0025
0.5767
0.5683
0.5560
0.5399
0.5202
5.0
5.1
5.2
5.3
5.4
-0.1776
-0.1443
-0.1103
-0.0758
-0.0412
-0.3276
-0.3371
-0.3431
-0.3460
-0.3453
8.0
8.1
8.2
8.3
8.4
0.1717
0.1475
0.1222
0.0960
0.0692
0.2346
0.2476
0.2580
0.2657
0.2708
2.5
2.6
2.7
2.8
2.9
-0.0484
-0.0968
-0.1424
-0.1850
-0.2143
0.4971
0.-1-708
0.4416
0.4097
0.3754
5.5
5.6
5.7
5.8
5.9
-0.0068
0.0270
0.0599
0.0917
0.1220
-0.3414
-0.3343
-0.3241
-0.3110
-0.2951
8.5
8.6
8.7
8.8
8.9
0.0419
0.0146
-0.0125
-0.0392
-0.0653
0.2731
0.2728
0.2697
0.2641
0.2559
1.3
-
hex)
I
x
1 1 (x)
10(x)
I
fo(x) ~ 0 for x = 2.40483. 5.52008, 8.65373, 11.7915, 14.9309, 1~.0711, 21.2116, 24.3525. 27.4935, 30.6346
11(x) = 0 for x = 3.83171, 7.01559,10.1735, 13.3237. 16.4706. 19.6159,22.7601,25.9037,29.0468.32.1897
A94
A95
APP. 5 Tables
(continued)
Table A1
-
x
:I
-
I
I
0.0
0.5
Y1 (xj
YolX}
I
:~ I
2.0
(-x)
(-x)
-0.445
0.088
0.382
0.510
-1.471
-0.781
-0.412
-0.107
Table Al
x
Yo\.l:)
Yl~\')
x
YoIX)
2.5
3.0
3.5
4.0
4.5
0.498
0.377
0.189
-0.017
-0.195
0.146
0.325
0.410
0.398
0.301
5.0
5.5
6.0
6.5
7.0
-0.309
-0.339
-0.288
-0.173
-0.026
Yl~\")
I
0.148
-0.024
-0.175
-0.274
-0.303
I
I
Gamma Function [see (24) in App. A3.1]
a
f(a)
a
na)
a
r(a)
a
na)
Q'
f(a)
1.00
1.000000
1.20
0.911l169
1.40
0.1l1l7264
1.60
0.1l93515
1.1l0
0.93131l4
1.02
1.04
1.06
LOll
0.988844
0.978438
0.968744
0.959725
1.22
1.24
1.26
1.28
0.913106
0.908521
0.904397
0.900718
1.42
1.44
1.46
1048
0.886356
0.885805
0.885604
0.885747
1.62
1.64
1.66
1.68
0.895924
0.898642
0.901668
0.905001
1.82
1.84
1.86
1.88
0.936845
0.942612
0.948687
0.955071
1.10
0.951351
1.30
0.897471
1.50
0.886227
1.70
0.908639
1.90
0.961766
1.12
1.14
1.16
1.18
0.943590
0.936416
0.929803
0.923728
1.32
1.34
1.36
1.38
0.894640
0.892216
0.890185
0.888537
1.52
1.54
1.56
1.58
0.887039
0.888178
0.889639
0.891420
1.72
1.74
1.76
1.78
0.912581
0.916826
0.921375
0.926227
1.92
1.94
1.96
1.98
I 0.976099
0.983743
1.20
0.918 169
1.40
0.887264
1.60
0.893515
1.80
0.931 384
2.00 11.000 000
:
I
I
I
'-
Table Al
0.968774
0.991 708
Factorial Function and Its Logarithm with Base 10
I
I
II
11!
log (II!)
11
11!
log (II!)
11
11'
log VI!)
1
2
3
4
5
1
2
6
24
120
u.uuuOOO
0.301 030
0.778 151
1.380211
2.079 181
6
7
8
9
10
72u
5040
4032u
362880
3628800
2.857332
3.702431
4.605521
5.559763
6.559763
11
12
13
14
15
39916800
479001 600
6227 u2u IlUO
87178291200
1 307674368000
7.6Ul 156
8.680337
9.794280
10.940408
12.116500
I
Table A4
I
erfx
Silxj
CIIXI
x
erfx
Si(xi
ci(x)
U.U
u.uuuu
u.uuuu
:JO
2.u
0.9953
1.6054
-0.4230
0.2
0.4
0.6
0.8
1.0
0.2227
0.4284
0.6039
0.7421
0.8427
0.1996
0.3965
0.5881
0.7721
0.9461
1.0422
0.3788
0.0223
-0.1983
-0.3374
2.2
2.4
2.6
2.8
3.0
0.9981
0.9993
0.9998
0.9999
1.0000
1.6876
1.7525
1.8004
1.8321
1.8487
-0.3751
-0.3173
-0.2533
-0.1865
-0.1196
1.2
0.9103
0.9523
0.9763
0.9891
0.9953
1.1080
1.2562
1.3892
1.5058
1.6054
-0.4205
-0.4620
-0.4717
-0.4568
-0.4230
3.2
3.4
3.6
3.8
4.0
1.0000
1.0000
1.0000
1.0000
1.0000
1.8514
1.8419
1.8219
1.7934
1.7582
-0.0553
0.0045
0.0580
0.1038
0.1410
I
I
1.8
2.0
I
Error Function, Sine and Cosine Integrals [see (35), (40), (42) in App. A3.1]
l:
104
1.6
I
II
A96
APP. 5
Tables
Table AS Binomial Distribution
Probability function f(x) [see (2), Sec. 24.7] and distribution function F(x)
p
-
"
x
= 0.1
I(~F(X)_
o.
p
-
J(x)
-
o.
= 0.2
F(x)
r--
P
-
= 0.3
F(x)
I(x)
- - -I - -
o.
p
I(x)
= 0.4
P = 0.5
F(x)
I(x)
o.
I
F(x)
o.
0
I
9000
1000
0.9000
1.0000
8000
2000
0.8000
1.0000
7000
3000
0.7000
1.0000
6000
4000
0.6000
1.0000
5000
5000
0.5000
1.0000
'1
0
I
2
!HOO
1800
0100
OJHOO
0.9900
1.0000
6400
3200
0400
0.6400
0.9600
1.0000
4'J00
4200
0900
OA'JOO
0.9100
1.0000
3600
4800
1600
0.3600
0.8400
1.0000
2500
5000
2500
0.2500
0.7500
1.0000
7290
2430
0270
0010
0.7290
0.9720
0.9990
1.0000
5120
3840
0960
0080
0.5120
3
0
1
2
3
0.9920
1.0000
3430
4410
I!NO
0270
0.3430
0.7840
0.9730
1.0000
2160
4320
2880
0640
0.2160
0.6480
0.9360
1.0000
1250
3750
3750
1250
0.1250
0.5000
0.8750
1.0000
6561
2916
0486
0036
0001
0.6561
0.9477
0.9963
0.9999
1.0000
40Y6
0.4096
0.8192
0.9728
0.9984
1.0000
2401
4116
2646
0756
0081
0.2401
0.6517
0.9163
0.9919
1.0000
12'J6
0.12'J6
4
0
I
2
3
4
3456
3456
1536
0256
0.4752
0.8208
0.9744
1.0000
0625
2500
3750
2500
0625
0.0625
0.3125
0.6875
0.9375
1.0000
0
5'}u5
3281
2
3
4
5
0081
0005
0000
u.5905
0.9185
0.9'J14
0.9995
1.0000
1.0000
2048
0512
0064
0003
u.3277
0.7373
0.9421
0.9933
0.9997
1.0000
Ib81
3602
3087
1323
0284
0024
u.lb81
0.5282
0.8369
0.9692
0.9976
1.0000
lJ778
I
3456
2304
0768
0102
(J.l)778
0.3370
0.6826
0.9130
0.9898
1.0000
lJ313
1563
3125
3125
1563
0313
0.u313
0.1875
0.5000
0.8125
0.9688
1.0000
0
I
2
3
4
5
6
5314
3543
0984
0146
0012
0001
0000
0.2621
0.6554
0.9011
0.'J830
0.9984
0.9999
1.0000
1176
3025
3241
1852
0595
0102
0007
0.1176
0.4202
0.7443
0.92'J5
1.0000
1.0000
2621
3932
2458
0819
0154
0015
0001
0467
1866
3110
2765
1382
0369
0041
0.0467
0.2333
0.5443
0.8208
0.9590
0.9959
1.0000
0156
0938
2344
3125
2344
0938
0156
0.0156
0.1094
0.3438
0.6563
0.8906
0.9844
1.0000
0
I
2
3
4
5
6
7
4783
3720
1240
0230
0026
0002
0000
0000
0.4783
0.8503
0.9743
0.9973
0.9998
1.0000
1.0000
1.0000
2097
3670
2753
1147
0287
0043
0004
0000
0.2097
0.5767
0.8520
0.'J667
0.9'J53
0824
2471
3177
2269
0972
0250
0036
0002
0.0824
0.3294
0.6471
0.8740
0.9712
0.9962
0.9998
1.0000
0280
1306
2613
2903
0.0280
0.1586
0.4199
0.7102
0.9037
0.9812
0.9984
1.0000
0078
0547
1641
2734
2734
1641
0547
0078
0.0078
0.0625
0.2266
0.5000
0.7734
0.9375
0.9922
1.0000
U
I
2
3
4
:;
6
7
8
4305
3826
1488
0331
0046
0004
0000
0000
0000
U.4305
0.8131
0.9619
1678
3355
0.1678
0.5033
0576
1977
2965
2541
1361
0467
0100
0012
0001
U.0576
0.2553
0.5518
0.8059
0.9420
0.9887
0.9987
0.9999
1.0000
0168
0896
2090
2787
2322
1239
0413
0079
0007
U.0168
0.1064
0.3154
0.5941
0.8263
0.'J502
0.9915
om'J
0313
1094
2188
2734
2188
1O'J4
0313
0039
o.um'J
0.0352
0.1445
0.3633
0.6367
0.8555
0.9648
0.9961
1.0000
I'
-
4096
1536
0256
0016
0.8Y60
-
I
5
072'J
3277
40Y6
25Y2
-
6
I
I
7
8
0.5314
0.8857
0.9841
0.9987
0.99'J'J
0.'J'J50
0.9996
1.0000
1.0000
1.0000
1.0000
0.9996
I.oooo
1.0000
2'J36
0.7Y69
1468
0459
0092
0011
0001
0000
0.9437
0.98'J6
0.9988
0.9999
1.0OOO
1.0000
0.9891
0.9993
1.0000
1'J35
0774
0172
0016
0.9'J93
1.0000
I
APP.5
A97
Tables
Table A6
Poisson Distribution
Probability function f(x) [see (5), Sec. 24.7] and distribution function F(x)
ix
f.L
I(x)
= 0.1
F(x)
o.
f.L
I(x)
= 0.2
F(x)
f.L
I(x)
= 0.3
F(x)
o.
0
9048
0.9048
O.
8187
O.ll 1117
7408
0.74011
1
2
3
4
5
0905
0045
0002
0000
0.9953
0.9998
1.0000
1.0000
1637
0164
0011
0001
0.9825
0.9989
0.9999
1.0000
2222
0333
0033
0003
0.9631
0.9964
0.9997
1.0000
x
I(~)
0
I
f.L
= 0.6
f.L
= 0.8
F(x)
I(x)
F(x)
5488
0.5488
O.
4966
0.4966
O.
4493
0.4493
3293
0988
0198
0030
0004
0.8781
0.9769
0.9966
0.9996
1.0000
3476
1217
0284
0050
0007
0.1l442
0.9659
0.9942
0.9992
0.9999
3595
1438
0383
0077
0012
0001
1.0000
0002
6
7
x
= 0.7
J(x)
o.
2
3
4
5
f.L
F(x)
f.L = 1.5
F(x)
I(r)
II.
I(x)
f.L=2
F(x)
II.
I(x)
f.L
I(x)
O.
6703
2681
0536
oon
0007
0001
f.L
I(x)
= 0.4
F(x)
f.L
I(x)
0.6703
O.
6065
0.9384
0.9921
0.9992
0.9999
1.0000
3033
0758
0126
0016
0002
= 0.9
= 0.5
F(x)
0.6065
0.9098
0.9856
0.9982
0.9998
1.0000
I(x}
4066
0.4066
3679
0.3679
0.8088
0.9526
0.9909
0.9986
0.9998
3659
1647
0494
0111
0020
0.7725
0.9371
0.9865
0.9977
0.9997
3679
1839
0613
0153
0031
0.7358
0.9197
0.9810
0.9963
0.9994
1.0000
0003
1.0000
0005
0001
0.9999
1.0000
f.L=3
F(x)
I(x}
o.
f.L=4
F(x)
o.
II.
I(x)
f.L=5
F(x)
II.
0
2231
0.2231
1353
0.1353
0498
0.0498
0183
0.0183
0067
0.0067
I
3347
2510
1255
0471
0141
0.5578
0.8088
0.9344
0.9814
0.9955
2707
2707
1804
0902
0361
0.4060
0.6767
0.8571
0.9473
0.9834
1494
2240
2240
1680
1O01l
0.1991
0.4232
0.6472
0.8153
0.9161
0733
1465
1954
1954
1563
0.0916
0.2381
0.4335
0.6288
0.7851
0337
0842
1404
1755
1755
0.0404
0.1247
0.2650
0.4405
0.6160
0035
0008
0001
0.9991
0.9998
1.0000
0120
0034
0009
0002
0.9955
0.9989
0.9998
1.0000
0504
0216
0081
0027
0008
0.9665
0.9881
0.9962
0.9989
0.9997
1042
0595
0298
0132
0053
0.8893
0.9489
0.9786
0.9919
0.9972
1462
1044
0653
0363
0181
0.7622
0.8666
0.9319
0.9682
0.9863
0002
0001
0.9999
1.0000
0019
0006
0.9991
0.9997
0082
0034
0.9945
0.9980
13
0002
0.9999
0013
0.9993
14
15
0001
1.0000
0005
0002
0.9998
0.9999
0000
1.0000
2
3
4
5
6
7
8
9
10
II
12
16
I
f.L=1
F(x)
F(x)
o.
I
I
A98
APP. 5 Tables
Table A7 Normal Distribution
Values of the distribution function <1>(.::) [see (3). Sec. 24.81. <1>( -~) = 1 - <1>(z)
z
<1>(;:)
5040
5080
5120
5160
5199
0.51
0.52
0.53
0.54
0.55
0.06
0.07
0.08
0.09
0.10
5239
5279
5319
5359
5398
0.11
0.12
0.13
0.14
0.15
z
<1>(~)
0.01
0.02
0.03
0.04
0.05
z
<1>(~)
z
<1>(~)
6950
6985
7019
7054
7088
loOI
lo02
1.03
1.04
1.05
9778
9783
9788
9793
9798
2.51
2.52
2.53
2.54
2.55
9940
9941
9943
9945
9946
0.56
0.57
0.58
0.59
0.60
7123
7157
7190
7224
7257
2.06
2.07
2.08
2.09
2.10
9R03
9808
9812
9817
9821
2.5ti
2.57
2.58
2.59
2.60
9948
9949
9951
9952
9953
5438
5478
5517
5557
5596
0.61
0.62
0.63
0.64
0.65
94ti3
9474
9484
9495
9505
HI
2.12
2.13
2.14
2.15
982ti
9830
9834
9838
9842
2.61
2.62
2.63
2.64
2.65
9955
9956
9957
9959
9960
0.16
0.17
0.18
0.19
0.20
5636
5675
5714
5753
5793
1.66
1.67
1.68
1.69
1.70
9515
9525
9535
9545
9554
2.16
2.17
2.18
2.19
2.20
9846
9850
9854
9857
9861
2.66
2.67
2.68
2.69
2.70
9961
9962
9963
9964
9965
0.21
0.22
0.23
0.24
0.25
8869
8888
8907
8925
8944
1.71
1.72
1.73
1.74
1.75
9564
9573
9582
9591
9599
2.21
2.22
2.23
2.24
2.25
9864
9868
9871
9875
9878
2.71
2.72
2.73
2.74
2.75
9966
9967
9968
9969
9<)70
1.26
1.27
1.28
1.29
1.30
8962
8980
8997
9015
9032
1.76
1.77
1.78
1.79
1.80
9608
9616
9625
9633
9641
2.26
2.27
2.28
2.29
2.30
9RRI
9884
9887
98<)0
9893
2.76
2.77
2.78
2.79
2.80
9971
9972
9973
9974
9974
7910
7939
7967
7995
8023
1.31
1.32
1.33
1.34
1.35
9049
9066
9082
9099
9115
1.81
1.82
lo83
1.84
1.85
9649
9656
9664
9671
9678
2.31
2.32
2.33
2.34
2.35
9896
9898
9901
9904
9906
2.81
2.82
2.83
2.84
2.85
9975
9976
9977
9977
9<)78
0.86
0.87
0.88
0.89
0.90
8051
8078
8106
8133
8159
1.36
1.37
1.38
1.39
lAO
9131
<)147
9162
9177
9192
1.86
1.87
1.88
1.89
1.90
9686
9693
9699
9706
9713
2.36
2.37
2.38
2.39
2AO
9909
9911
9913
9916
9918
2.86
2.87
2.88
2.89
2.90
9979
9979
9980
9981
9981
6591
6628
6664
6700
6736
0.91
0.92
0.93
0.94
0.95
8186
8212
8238
8264
8289
1.41
1.42
1.43
1.44
lA5
9207
9222
9236
9251
9265
1.91
1.92
1.93
1.94
1.95
9719
9726
9732
9738
9744
2.41
2A2
2.43
2.-14
2A5
9920
9922
9925
9927
<)929
2.91
2.92
2.93
2.94
2.95
9982
9982
9983
9984
9984
6772
6808
6844
6879
6915
0.96
0.97
0.98
0.99
1.00
8315
8340
8365
8389
8413
1.46
1.47
1.48
1.49
1.50
9279
<)292
9306
9319
9332
1.96
1.97
1.98
1.99
2.00
9750
9756
9761
9767
9772
2A6
2A7
2A8
2.49
2.50
9931
9932
9934
9936
9938
2.96
2.97
2.98
2.99
3.00
9985
9<)85
9986
9986
9987
~
<P(~)
<.
<1>(~)
8438
8461
8485
8508
8531
1.51
1.52
1.53
1.54
1.55
9345
9357
9370
9382
9394
2.01
2.02
2.03
2.04
2.05
1.06
1.07
1.08
1.09
1.10
8554
8577
8599
8621
8643
1.56
1.57
1.58
1.59
1.60
9406
9418
9429
9441
9452
7291
7324
7357
7389
7422
1.11
1.12
1.13
1.14
1.15
8665
8686
8708
8729
8749
1.61
1.62
1.63
1.64
1.65
0.66
0.67
0.68
0.69
0.70
7454
7486
7517
7549
7580
1.16
1.17
1.18
1.19
1.20
8770
8790
8810
8830
8849
5832
5871
5910
5948
5987
0.71
0.72
0.73
0.74
0.75
7611
7642
7ti73
7704
7734
1.21
1.22
1.23
1.24
1.25
0.26
0.27
0.28
0.29
0.30
6026
6064
6103
6141
6179
0.76
0.77
0.78
0.79
0.80
7764
7794
7823
7852
7881
0.31
0.32
0.33
0.34
0.35
6217
6255
6293
6331
ti3tiR
0.81
0.82
0.83
0.84
0.85
0.36
0.37
0.38
0.39
0.40
6406
6443
6480
6517
6554
OAI
OA2
OA3
0.44
OA5
OA6
OA7
0.48
OA9
I 0.50
u.
u.
u.
u.
I
-
u.
u.
A99
APP. 5 Tables
Table AS Normal Distribution
Values of z for given values of cI>(z) [see (3). Sec. 24.8] and D(-;.)
Example: -;. = 0.279 if cI>(-;.) = 61 %; z = 0.860 if D(-;.) = 61 %.
%
~(<1»
z(D)
%
z(<1»
cI>(-;.) -
cI>( - z)
~(D)
%
:(<1»
:(D)
1
2
3
4
5
-2.326
-2.054
-1.881
-1.751
-1.645
0.013
0.025
0.038
0.050
0.063
41
42
43
44
45
-0.22S
-0.202
-0.176
-0.151
-0.126
0.539
0.553
0.568
0.583
0.598
SI
82
83
84
85
0.S7S
0.915
0.954
0.994
1.036
1.311
1.341
1.372
1.405
1.440
6
7
8
9
10
-1.555
-1.476
-1.405
-1.341
-1.282
0.075
0.088
0.100
0.113
0.126
46
47
48
49
50
-0.100
-0.075
-0.050
-0.025
0.000
0.613
0.628
0.643
0.659
0.674
86
S7
88
89
90
1.080
1.126
1.175
1.227
1.282
1.476
1.514
1.555
1.598
1.645
11
12
13
14
15
-1.227
-1.175
-1.126
-1.080
-1.036
0.138
0.151
0.164
0.176
0.IS9
51
52
53
54
55
0.025
0.050
0.075
0.100
0.126
0.690
0.706
0.722
0.739
0.755
91
92
93
94
95
1.341
1.405
1.476
1.555
1.645
1.695
1.751
1.812
1.881
1.960
16
17
18
19
20
-0.994
-0.954
-0.915
-0.878
-0.842
0.202
0.215
0.228
0.240
0.253
56
57
58
59
60
0.151
0.176
0.202
0.228
0.253
0.772
0.789
0.806
0.824
0.842
96
97
97.5
98
99
1.751
1.881
1.9{)()
2.054
2.326
2.054
2.170
2.241
2.326
2.576
21
22
23
24
25
-0.806
-0.772
-0.739
-0.706
-0.674
0.266
0.279
0.292
0.305
0.319
61
62
63
64
65
0.279
0.305
0.332
0.358
0.385
0.860
0.878
0.896
0.915
0.935
99.1
99.2
99.3
99.4
99.5
2.366
2.409
2.457
2.512
2.576
2.612
2.652
2.697
2.748
2.807
26
27
2R
29
30
- 0.643
- 0.613
-0.5S3
-0.553
-0.524
0.332
0.345
0.358
0.372
0.385
66
67
68
69
70
0.412
0.-140
0.468
0.496
0.524
0.954
0.974
0.994
1.015
1.036
99.6
99.7
99.8
99.9
2.652
2.748
2.878
3.090
2.878
2.968
3.090
3.291
31
32
33
34
35
-0.496
-0.468
-0.-140
-0.412
-0.385
0.399
0.412
0.-1-26
0.440
0.454
71
72
73
74
75
0.553
0.583
0.613
0.643
0.674
1.058
1.080
1.103
1.126
1.150
99.91
99.92
99.93
99.94
99.95
3.121
3.156
3.195
3.239
3.291
3.320
3.353
3.390
3.432
3.481
36
37
38
39
40
-0.358
-0.332
-0.305
-0.279
-0.253
0.468
0.482
0.496
0.510
0.524
76
77
78
79
80
0.706
0.739
0.772
O.S06
0.842
1.175
1.200
1.227
1.254
1.282
99.96
99.97
99.98
99.99
3.353
3.432
3.540
3.719
3.540
3.615
3.719
3.891
- --
,
A100
APP. 5
Tables
Table A9 t-Distribution
Values of z for given values of the distribution function F(z) (see (8) in Sec. 25.3).
Example: For 9 degrees of freedom, z = 1.83 when F(z) = 0.95.
Number of Degrees of Freedom
F(:)
2
3
4
5
6
7
8
9
10
0.00
0.32
0.73
1.38
3.08
0.00
0.29
0.62
1.06
1.89
0.00
0.28
0.58
0.98
1.64
0.00
0.27
0.57
0.94
1.53
0.00
0.27
0.56
0.92
1.48
0.00
0.26
0.55
0.91
1.44
0.00
0.26
0.55
0.90
1.41
0.00
0.26
0.55
0.89
1.40
0.00
0.26
0.54
0.88
1.38
0.00
0.26
0.54
0.88
1.37
6.31
12.7
31.8
63.7
318.3
2.92
4.30
6.96
9.92
22.3
2.35
3.18
4.54
5.84
10.2
2.13
2.78
3.75
4.60
7.17
2.02
2.57
3.36
4.03
5.89
1.94
2.45
3.14
3.71
5.21
1.89
2.36
3.00
3.50
4.79
1.86
2.31
2.90
3.36
4.50
1.83
2.26
2.82
3.25
4.30
1.81
2.23
2.76
3.17
4.14
19
20
L
0.5
0.6
0.7
0.8
0.9
0.95
0.975
I
0.99
0.995
0.999
l
Number of Degrees of Freedom
F(~)
II
12
0.00
0.26
0.54
0.88
1.36
U.OO
0.6
0.7
0.8
0.9
0.26
0.54
0.87
1.36
0.95
0.975
0.99
0.995
0.999
1.80
2.20
2.72
3.11
4.02
1.78
2.18
2.68
3.05
3.93
1.77
2.16
2.65
3.01
3.85
U.S
I
13
14
15
17
18
U.OO
U.OO
!l.OO
0.26
0.54
0.87
1.35
0.26
0.54
0.87
1.35
0.26
0.54
0.87
1.34
O.UU
U.OU
0.26
0.54
0.86
1.34
0.26
0.53
0.86
1.33
0.00
0.26
0.53
0.86
1.33
1.76
2.14
2.62
2.98
3.79
1.75
2.13
2.60
2.95
3.73
1.75
2.12
2.58
2.92
3.69
1.74
2.11
2.57
2.90
3.65
16
II
U.OO
U.UU
0.26
0.53
0.86
1.33
0.26
0.53
0.86
1.33
1.73
2.10
2.55
2.88
3.61
1.73
2.09
2.54
2.86
3.58
1.72
2.09
2.53
2.85
3.55
I
Number of Degrees of Freedom
F(:)
22
24
26
28
30
40
50
100
200
ex:
0.5
0.6
0.7
0.8
0.9
0.00
0.26
0.53
0.86
1.32
0.00
0.26
0.53
0.86
1.32
0.00
0.26
0.53
0.86
1.31
0.00
0.26
0.53
0.85
1.31
0.00
0.26
0.53
0.85
1.31
0.00
0.26
0.53
0.85
1.30
0.00
0.25
0.53
0.85
1.30
0.00
0.25
0.53
0.85
1.29
0.00
0.25
0.53
0.84
1.29
0.00
0.25
0.52
0.84
1.28
0.95
0.975
0.99
1.72
2.07
2.51
2.82
3.50
1.71
2.06
2.49
2.80
3.47
1.71
2.06
2.48
2.78
3.43
1.70
2.05
2.47
2.76
3.41
1.70
2.04
2.46
2.75
3.39
1.68
2.02
2.42
2.70
3.31
1.68
2.01
2.40
2.68
3.26
1.66
1.98
2.36
2.63
3.17
1.65
1.97
2.35
2.60
3.13
1.65
1.96
2.33
2.58
3.09
I 0.995
0.999
I
APP. 5
A10l
Tables
Table A10 Chi-square Distribution
Values of x for given values of the distribution function F(z) (see Sec. 25.3 before (17)).
Example: For 3 degrees of freedom, z = 1l.34 when F(z.) = 0.99.
Number of Degrees of Freedom
F(z)
1
2
3
0.005
0.01
0.025
0.05
0.00
0.00
0.00
0.00
0.01
0.02
0.05
0.10
0.07
0.11
0.22
0.35
0.95
0.975
0.99
0.995
3.84
5.02
6.63
7.88
5.99
7.38
9.21
10.60
7.81
9.35
11.34
12.84
4
5
6
0.21
0.30
0.48
0.71
0.41
0.55
0.83
US
0.68
0.87
1.24
1.64
9.49
11.14
13.28
14.86
11.07
12.83
15.09
16.75
12.59
14.45
16.81
18.55
7
8
- 0.99
1.34
1.24
1.65
2.18
1.69
2.17
"2.73
14.07
16.01
15.51
17.53
20.09
21.95
18A8
20.28
9
10
1.73
2.09
2.70
3.33
2.16
2.56
3.25
3.94
16.92
19.02
21.67
23.59
18.31
20.48
23.21
25.19
-
Number of Degrees of Freedom
F(z)
11
12
13
14
IS
16
17
18
19
20
0.005
0.01
0.025
0.05
2.60
3.05
3.82
4.57
3.07
3.57
4.40
5.23
3.57
4.11
5.01
5.89
4.07
4.66
5.63
6.57
4.60
5.23
6.26
7.26
5.14
5.81
6.91
7.96
5.70
6.41
7.56
8.67
6.26
7.01
8.23
9.39
6.84
7.63
8.91
10.12
7.43
8.26
9.59
10.85
0.95
0.975
0.99
0.995
19.68
21.92
24.72
26.76
21.03
23.34
26.22
28.30
22.36
24.74
27.69
29.82
23.68
26.12
29.14
31.32
25.00
27.49
30.58
32.80
26.30
28.85
32.00
34.27
27.59
30.19
33.41
35.72
28.87
31.53
34.81
37.16
30.14
32.85
36.19
38.58
31.41
34.17
37.57
40.00
--
-'--
~
--
-~
-~
Number of Degrees of Freedom
F(::J
21
22
23
24
25
26
27
28
29
30
0.005
0.01
0.0"25
0.05
8.0
8.9
10.3
11.6
8.6
9.5
11.0
12.3
9.3
10.2
11.7
13.1
9.9
10.9
12.4
13.8
10.5
11.5
13.1
14.6
11.2
12.2
13.8
15.4
11.8
12.9
14.6
16.2
12.5
13.6
15.3
16.9
13.1
14.3
16.0
17.7
13.8
15.0
16.8
18.5
0.95
0.975
0.99
0.995
32.7
35.5
38.9
41..1-
33.9
36.8
40.3
42.8
35.2
38.1
41.6
44.2
36.4
39.4
43.0
45.6
37.7
40.6
44.3
46.9
38.9
41.9
45.6
48.3
40.1
43.2
47.0
49.6
41.3
44.5
48.3
51.0
42.6
45.7
49.6
52.3
43.8
47.0
50.9
53.7
Number of Degrees of Freedom
F(z)
> 100 (Approximation)
40
50
60
70
80
90
100
0.005
0.01
0.025
0.05
20.7
22.2
24.4
26.5
28.0
29.7
32.4
34.8
35.5
37.5
40.5
43.2
43.3
45.4
48.8
51.7
51."2
53.5
57.2
60.4
59."2
61.8
65.6
69.1
67.3
70.1
74.2
77.9
!(h
!(h
!(h
!(h
-
0.95
0.975
0.99
L 0.995
55.8
59.3
63.7
66.8
67.5
71.4
76.2
79.5
79.1
83.3
88.4
92.0
90.5
95.0
100.4
104.2
101.9
106.6
112.3
116.3
113.1
118.1
124.1
1"28.3
124.3
129.6
135.8
140.2
!(h
!(h
+ 1.(4)2
+ 1.96)2
+ "2.33)2
+ 2.58)2
--
In the last column, h
=
~, where
III IS
the nUlllbel of degree~ of freedom.
~(h
WI
2.58)2
2.33)2
1.96)2
1.64)2
Al02
APP.5
Tables
Table All F-Distribution with (m, n) Degrees of Freedom
Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value
Example: For (7, 4) d.f., .::: = 6.09 if F(.:::) = 0.95.
111=1
II
111=2
111=3
111=4
-
= 5
111=6
111 = 7
111=8
111=9
230
19.3
9.01
6.26
5.05
234
19.3
S.94
6.16
4.95
237
19.4
8.89
6.09
4.88
239
19.4
8.85
6.04
4.82
241
19.4
8.81
6.00
4.77
III
I
2
3
4
5
161
18.5
10.1
7.71
6.61
200
19.0
9.55
6.94
5.79
216
19.2
9.2S
6.59
5.41
225
19.2
9.12
6.39
5.19
6
7
8
9
10
5.99
5.59
5.32
5.12
4.%
5.14
4.74
4.46
4.26
4.10
4.76
4.35
4.07
3.86
3.71
4.53
4.12
3.84
3.63
3.4R
4.39
3.97
3.69
3A8
3.33
4.28
3.87
3.5S
3.37
3.22
4.21
3.79
3.50
3.29
3.14
4.15
3.73
3.44
3.23
3.07
4.10
3.68
3.39
3.18
3.02
II
12
13
14
15
4.84
4.75
4.67
4.60
4.54
3.98
3.89
3.SI
3.74
3.68
3.59
3.49
3.41
3.34
3.29
3.36
3.26
3.18
3.11
3.06
3.20
3.11
3.03
2.96
2.90
3.09
3.00
2.92
2.85
2.79
3.01
2.91
2.83
2.76
2.71
2.95
2.85
2.77
2.70
2.64
2.90
2.80
2.71
2.65
2.59
16
17
18
19
20
4.49
4.45
4.41
4.38
4.35
3.63
3.59
3.55
3.52
3.49
3.24
3.20
3.16
3.13
3.10
3.01
2.96
2.93
2.90
2.87
2.85
2.81
2.77
2.74
2.71
2.74
2.70
2.66
2.63
2.60
2.66
2.61
2.5S
2.54
2.51
2.59
2.55
2.51
2.48
2.45
2.54
2.49
2.46
2.42
2.39
22
24
26
28
30
4.30
4.26
4.23
4.20
4.17
3.44
3.40
3.37
3.34
3.32
3.05
3.01
2.98
2.95
2.92
2.82
2.78
2.74
2.71
2.69
2.66
2.62
2.59
2.56
2.53
2.55
2.51
2.47
2.45
2A2
2.46
2.42
2.39
2.36
2.33
2.40
2.36
2.32
2.29
2.27
2.34
2.30
2.27
2.24
2.21
32
34
36
38
40
4.15
4.13
4.11
4.10
4.08
3.29
3.28
3.26
3.24
3.23
2.90
2.88
2.87
2.85
2.84
2.67
2.65
2.63
2.62
2.61
2.51
2.49
2.48
2.46
2.45
2.40
2.38
2.36
2.35
2.34
2.31
2.29
2.28
2.26
2.25
2.24
2.23
2.21
2.19
2.18
2.19
2.17
2.15
2.14
2.12
50
60
70
80
90
4.03
4.00
3.98
3.96
3.95
3.18
3.15
3.13
3.11
3.10
2.79
2.76
2.74
2.72
2.71
2.56
2.53
2.50
2.49
2.47
2.40
2.37
2.35
2.33
2.32
2.29
2.25
2.23
2.21
2.20
2.20
2.17
2.14
2.13
2.11
2.13
2.10
2.07
2.06
2.04
2.07
2.04
2.02
2.00
1.99
100
150
200
IOUO
3.94
3.90
3.89
3.85
3.84
3.09
3.06
3.04
3.00
3.00
2.70
2.66
2.65
2.61
2.60
2.46
2.43
2.42
2.38
2.37
2.31
2.27
2.26
2.22
2.21
2.19
2.16
2.14
2.11
2.10
2.10
2.07
2.06
2.02
2.01
2.03
2.00
1.98
1.95
1.94
1.97
1.94
1.93
1.89
1.88
I
:x:
0.95
APP. 5
A103
Tables
Table All F-Distribution with (m, n) Degrees of Freedom (continued)
Values of z for which the distribution function F(z) [see (13), Sec. 25.4] has the value
111=10
n
111
= 15
III
= 20
111
= 30
111
= 40
III
= 50
III
= 100
0.95
00
2
3
4
5
242
19.4
8.79
5.96
4.74
246
19.4
8.70
5.86
4.62
248
19.4
8.66
5.80
4.56
250
19.5
8.62
5.75
4.50
251
19.5
8.59
5.72
4.46
252
19.5
8.58
5.70
4.44
253
19.5
8.55
5.66
4.41
254
19.5
8.53
5.63
4.37
6
7
8
9
10
4.06
3.64
3.35
3.14
2.98
3.94
3.51
3.22
3.01
2.85
3.87
3A4
3.15
2.94
2.77
3.81
3.38
3.08
2.86
2.70
3.77
3.34
3.04
2.83
2.66
3.75
3.32
3.02
2.80
2.64
3.71
3.27
2.97
2.76
2.59
3.67
3.23
2.93
2.71
2.54
11
12
13
14
15
2.85
2.75
2.67
2.60
2.54
2.72
2.62
2.53
2.46
2.40
2.65
2.54
2.46
2.39
2.33
2.57
2.47
2.38
2.31
2.25
2.53
2.43
2.34
2.27
2.20
2.51
2.40
2.31
2.24
2.18
2.46
2.35
2.26
2.19
2.12
2.40
2.30
2.21
2.13
2.07
16
17
18
19
20
2.49
2.45
2.41
2.38
2.35
2.35
2.31
2.27
2.23
2.20
2.28
2.23
2.19
2.16
2.12
2.19
2.15
2.11
2.07
2.04
2.15
2.10
2.06
2.03
1.99
2.12
2.08
2.04
2.00
1.97
2.07
2.02
1.98
1.94
1.91
2.01
1.96
1.92
1.88
1.84
22
24
26
28
30
2.30
2.25
2.22
2.19
2.16
2.15
2.11
2.07
2.04
2.01
2.07
2.03
1.99
1.96
1.93
1.98
1.94
1.90
1.87
1.84
1.94
1.89
1.85
1.82
1.79
1.91
1.86
1.82
1.79
1.76
1.85
1.80
1.76
1.73
1.70
1.78
1.73
1.69
1.65
1.62
32
34
36
38
40
2.14
2.12
2.11
2.09
2.08
1.99
1.97
1.95
1.94
1.92
1.91
1.89
1.87
1.85
1.84
1.82
1.80
1.78
1.76
1.74
1.77
1.75
1.73
1.71
1.69
1.74
1.71
1.69
1.68
1.66
1.67
1.65
1.62
1.61
1.59
1.59
1.57
1.55
1.53
1.51
50
60
70
80
90
2.03
1.99
1.97
1.95
1.94
un
1.84
1.81
1.79
1.78
1.78
1.75
1.72
1.70
1.69
1.69
1.65
1.62
1.60
1.59
1.63
1.59
1.57
1.54
1.53
1.60
1.56
1.53
1.51
1.49
1.52
1.48
1.45
1.43
1.41
1.44
1.39
1.35
1.32
1.30
100
150
200
1000
1.93
1.89
1.88
1.84
1.83
1.77
1.73
1.72
1.68
1.67
1.68
1.64
1.62
1.58
1.57
1.57
1.54
1.52
1.47
1.46
1.52
1.48
1.46
1.41
1.39
1.48
1.44
1.41
1.36
1.35
1.39
1.34
1.32
1.26
1.24
1.28
1.22
1.19
1.08
1.00
I
cc
I
Al04
APP. 5
Tables
Table All F-Distribution with (m, n) Degrees of Freedom (continued)
Values of:z for which the distribution function F(::.) [see (13), Sec. 25.4] has the value
0.99
m=1
111=2
m=3
111=4
111=5
111=6
111=7
111=8
111=9
1
2
3
4
5
4052
98.5
34.1
21.2
16.3
4999
99.0
30.8
18.0
13.3
5403
99.2
29.5
16.7
12.1
5625
99.2
28.7
16.0
11.4
5764
99.3
28.2
15.5
11.0
5859
99.3
27.9
15.2
10.7
5928
99.4
27.7
15.0
10.5
5981
99.4
27.5
14.8
10.3
6022
99.4
27.3
14.7
10.2
6
7
8
9
10
13.7
12.2
11.3
10.6
10.0
n
10.9
9.55
8.65
8.02
7.56
9.78
8.45
7.59
6.99
6.55
9.15
7.85
7.01
6.42
5.99
8.75
7.46
6.63
6.06
5.64
8.47
7.19
6.37
5.80
5.39
8.26
6.99
6.18
5.61
5.20
8.10
6.84
6.03
5.47
5.06
7.98
6.72
5.91
5.35
4.94
12
13
14
15
9.65
9.33
9.07
8.86
8.68
7.21
6.93
6.70
6.51
6.36
6.22
5.95
5.74
5.56
5.42
5.67
5.41
5.21
5.04
4.89
5.32
5.06
4.86
4.69
4.56
5.07
4.82
4.62
4.46
4.32
4.89
4.64
4.44
4.28
4.14
4.74
4.50
4.30
4.14
4.00
4.63
4.39
4.19
4.03
3.89
16
17
18
19
20
8.53
8.40
8.29
8.18
8.10
6.23
6.11
6.01
5.93
5.85
5.29
5.18
5.09
5.01
4.94
4.77
4.67
4.58
4.50
4.43
4.44
4.34
4.25
4.17
4.10
4.20
4.10
4.01
3.94
3.87
4.03
3.93
3.84
3.77
3.70
3.89
3.79
3.71
3.63
3.511
3.78
3.68
3.60
3.52
3.46
22
24
26
28
30
7.95
7.82
7.72
7.64
7.56
5.72
5.61
5.53
5.45
5.39
4.82
4.72
4.64
4.57
4.51
4.31
4.22
4.14
4.07
4.02
3.99
3.90
3.82
3.75
3.70
3.76
3.67
3.59
3.53
3.47
3.59
3.50
3.42
3.36
3.30
3.45
3.36
3.29
3.23
3.17
3.35
3.26
3.18
3.12
3.07
32
34
36
38
40
7.50
7.44
7.40
7.35
7.31
5.34
5.29
5.25
5.21
5.18
4.46
4.42
4.38
4.34
4.31
3.97
3.93
3.89
3.86
3.83
3.65
3.61
3.57
3.54
3.51
3.43
3.39
3.35
3.32
3.29
3.26
3.22
3.18
3.15
3.12
3.13
3.09
3.05
3.02
2.99
3.02
2.98
2.95
2.92
2.89
50
60
70
80
90
7.17
7.08
7.01
6.96
6.93
5.06
4.98
4.92
4.88
4.85
4.20
4.13
4.07
4.04
4.01
3.72
3.65
3.60
3.56
3.54
3.41
3.34
3.29
3.26
3.23
3.19
3.12
3.07
3.04
3.01
3.02
2.95
2.91
2.87
2.84
2.89
2.82
2.78
2.74
2.72
2.78
2.72
2.67
2.64
2.61
100
150
200
1000
6.90
6.81
6.76
6.66
6.63
4.82
4.75
4.71
4.63
4.61
3.98
3.91
3.88
3.80
3.78
3.51
3.45
3.41
3.34
3.32
3.21
3.14
3.11
3.04
3.m
2.99
2.92
2.89
2.82
2.80
2.82
2.76
2.73
2.66
2.64
2.69
2.63
2.60
2.53
2.51
2.59
2.53
2.50
2.43
2.41
11
'"
A10S
APP. 5 Tables
F-Distribution with (m, n) Degrees of Freedom (continued)
Values of ~ for which the distribution function F(;:) [see (13), Sec. 25.4] has the value
Table All
11
III
III
6056
99.4
27.2
14.5
10.1
I
2
3
4
5
I
= 10
= 15
III
=
20
111
= 30
111
= 40
III
= 50
III
= 100
0.99
co
6157
99.4
26.9
14.2
9.72
6209
99.4
26.7
14.0
9.55
6261
99.5
26.5
13.8
9.38
6287
99.5
26.4
13.7
9.29
6303
99.5
26.4
13.7
9.24
6334
99.5
26.2
13.6
9.13
6366
99.5
26.1
13.5
9.G:!
6
7
8
9
10
7.87
6.62
5.81
5.26
4.85
7.56
6.31
5.52
4.96
4.56
7.40
6.16
5.36
4.81
4.41
7.23
5.99
5.20
4.65
4.25
7.14
5.91
5.12
4.57
4.17
7.09
5.86
5.07
4.52
4.12
6.99
5.75
4.96
4.42
4.01
5. 65
4.86
4.31
3.91
11
12
13
14
15
4.54
4.30
4.10
3.94
3.80
4.25
4.01
3.66
3.52
4.10
3.86
3.66
3.51
3.37
3.94
3.70
3.51
3.35
3.21
3.86
3.62
3.43
3.27
3.13
3.81
3.57
3.38
3.22
3.08
3.71
3.47
3.27
3.11
2.98
3.60
3.36
3.17
3.00
2.87
16
3.69
3.59
3.51
3.43
3.37
3.31
3.23
3.15
3.09
3.26
3.16
3.08
3.00
2.94
3.10
3.00
2.92
2.84
2.78
3.02
2.92
2.84
2.76
2.69
2.97
2.87
2.78
2.71
2.64
2.86
2.76
2.68
2.60
2.54
2.75
2.65
2.57
2.49
2.42
3.26
3.17
3.09
3.03
2.98
2.98
2.89
2.81
2.75
2.70
2.83
2.74
2.66
2.60
2.55
2.67
2.58
2.50
2.44
2.39
2.58
2.49
2.42
2.35
2.30
2.53
2.44
2.36
2.30
2.25
2.42
2.33
2.25
2.19
2.13
2.31
2.21
2.13
2.06
2.01
34
36
38
40
2.93
2.89
2.86
2.83
2.80
2.65
2.61
2.58
2.55
2.52
2.50
2.46
2.43
2.40
2.37
2.34
2.30
2.26
2.23
2.20
2.25
2.21
2.18
2.14
2.11
2.20
2.16
2.12
2.09
2.06
2.08
2.04
2.00
1.97
1.94
1.96
1.91
1.87
1.84
1.80
50
60
70
80
90
2.70
2.63
2.59
2.55
2.52
2.42
2.35
2.31
2.27
2.24
2.27
2.20
2.15
2.12
2.09
2.10
2.03
1.98
1.94
1.92
2.01
1.94
1.89
1.85
1.82
1.95
1.88
1.83
1.79
1.76
1.82
1.75
1.70
1.65
1.62
1.68
1.60
1.54
1.49
1.46
100
150
200
1000
2.50
2.44
2.22
2.16
2.13
2.06
2.04
2.07
2.00
1.97
1.89
1.83
1.79
1.72
1.70
1.80
1.73
1.69
1.61
1.59
1.74
1.66
1.63
1.54
1.52
1.60
1.52
1.48
1.38
1.36
1.43
1.33
1.28
1.11
1.00
171
18
19
20
HZ
3.-l1
6.88
I
22
24
26
28
:~ I
I
I
2.-l1
2.34
2.32
JC
I
I
I
1.90
1.88
I
I
1
A106
APP. 5 Tables
Table All Distribution Function F(x)
in Section 25.8
~
I
x
=3
o
167
I
£ I
1=20
o.
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
I 67
68
69
70
71
72
73
74
75
1 76
'77
'78
179
'80
'81
82
83
84
85
86
87
88
89
90
91
92
93
94
1
001
002
002
003
004
005
006
007
008
010
012
014
017
020
023
027 1
032
037
043
049
056
064
073
082
093
104
117
130
144
159
176
193
211 I
230
250
271
293
315
339 1
362
387
411 1
436
462
487
=6
0
1
2
3
4
5
6
7
001
008
028
068
136
235
360
500
o.
008
1 1042
2 117
0421
1 167
2 , 375
x
o.
o
o
~
x
x
134 12421
408
~
x =19
,
,
I
,
11
x
1
2
3
4
5
6
001
0051
015
035
068
119
71 191
8 281
9 386
10,500
O.
,
73 411
74 441
, 75 1 -1-70
: 76 500
1
6
7
8
9
10
11
II
=8
O.
I001
0031
007
016
031
054
089
138
199
274
360
452
IZ
001
002
003
003
004
005
007
009 1
011 I
013
016
020
024
029
0341
041
048
056
066
076
088
100
115
130
147
165
64 184
65 1 205
66 227
67 250
68 275
69 300
70 327
71 354
,72 383
,
2
3
4
5
12
13
=18
38
39
40
41
'42
143
'144
45
46
'47
148
'49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
I
X ,
o. I
O.
43 001
44 002
45 002
46 003'
47 003
48 004,
49 005
50 006
51 008
52 010
53 012
54 014 1
155 017
56 021
57 025
58 029 I
59 034
I 60 040'
61 0471
62 054
163 062:
64 072 I
65 082
66 I 093 :
67 1105
68 119
69 133 1
70 149'
71 166
72 184'
73 203 ~
74 223
75 245,
76 267
1
77 290 I
78 314
79 339
1
80 3651
181 391
82 4181
83 445
84 473
85 500
I
=71
II
11
/I
=4
1 O.
o.
~
11
= P(T ~ x) of the Random Variable T
x
=17
I
I
~
/I
x
~
7
8
9
10
11
12
13
14
15
16
,17
IZ
x
=10
6 001
7 002
8 005
4 001
5 003
6' 006
O.
32 001
33 002
34 002 1
27
35 003
28 002
36 1004
29 002
37 005
30 003
38 007
31 004
39 009
32 006
40 011
33 008
41 014
34 010
42 017
35 013
43 021
44 026
361016
37 021
45 032
381026
46 1038
39 032
47 046
40 039
48 054
41 048
49 064
42,058
50 076
51 088
431070
44 083
52 102
45 097
53 118
46 114
54 135
1
47 133
55 154
1
48 153
56 174
49 175
57 196
50 199
58 220
51 1 225
59 245
52 253
60 271
53 282
61 299
54' 313
62 328
55 345
63 358
64 388' 56 378
65 420
57 412/
66 452/ 58 1 447
59 482
, / 67 484
I
rE
8 001
9 002
10 003
11 0051
12 008
13 013 1
9 0081
0121
10 014
022
038
11 023
060' 12 0361 14 020
13 054
090
15 030
16 043
130, 1 14 078
17 060
179
151108
238 I 16 146
18 082
"106' 1 171190
19 109
18 242
20 141
381
460 119 300 121 179
1
1 20 364
22 2231
'23 271
431
500 '24 324
,
I
I~~
l
x
27
=15
o.
~! ~~I
25 003
26 004
27006
28 008
29 010
30 014
31 018
32 023
£
=14
18
19
20
21
22
23
24
25,
001
002
O.
~
=';3
x
om'
O.
003
005'
007 I
010
013 ,
14
15
16
17
18
001
001
002
003
005
11
x
I
=12
O.
11 001
12 002
1
28
29
30
31
132
33
34
35
36
37
38
39
40
141
42
43
44
'45
I
~
,~! ~~~ ~~ ,~~! I ~~ ~~ I!!~!
35 046
136 057
137 070 1
38 0841
, 391101
: 401120
411141
42 164
43 190
44 218
45 248
46 279
47 313
I 48 3491
: 49 385
,50 423
51 461
'52 500
I
=11
031
21 015
15 007
116 010
040 122 , 021
23 029
17 016
051
063 124 038
18 022
25 050
19 031
079
096
26 064
20 043
117, 27 082
21 058
140
28 102
22 076
165
29 126/ 23 098
194: 30 153
24 125
225
31 1 184
25 155
259
32 218/ 26 190
295
33 255 I 27 230
334 I 34 295
28 273
374
35 338 129 319
415
36 383
30 369
457 137 4291 131 420
500 I 38 476' 32 473
PHOTO
CREDITS
Part A Opener: John SohmJChromoshohmJPhoto Researchers.
Part B Opener: Lester Lefkowitz/Corbis Images.
Part C Opener: Science Photo Library/Photo Researchers.
Part D Opener: Lester Lefkowitz/Corbis Images.
Part E Opener: Walter Hodges/Stone/Getty Images.
Chapter 19, Figure 434: From Dennis Sharp, ARCHITECKTUR. ©1973 der
deutschsprachigen Ausgabe Edition Praeger GmbH, Munchen.
Part F Opener: Charles O'Rear/Corbis Images.
Part G Opener: Greg Pease/Stone/Getty Images.
Appendix 1 Opener: Patrick Bennett/Corbis Images.
Appendix 2 Opener: Richard T. Nowitz/Corbis Images.
Appendix 3 Opener: Chris KapolkaiStone/Getty Images.
Appendix 4 Opener: John OlsoniCorbis Images.
Pl
IN 0 E X
Page numbers AI, A2, A3, ... refer to App. I to App. 5 at the end of the book.
A
Abel 78
Absolute
convergence 667,697
frequency 994. 1000
value 607
Absolutely integrable 508
Acceleration 395, 995
Acceptance sampling 1073
Adams-Bashforth methods 899
Adams-Moulton methods 900
Adaptive 824
Addition of
complex numbers 603
matrices 275
means 1038
normal random variables 1050
power series 174. 680
variances 1039
vectors 276, 324, 367
Addition rule 1002
Adiabatic 561
ADI method 915
Adjacency matrix 956
Adjacent vertices 955
Airfoil 732
Airy's equation 552. 904
Algebraic multiplicity 337, 865
Algorithm 777, 783
Dijkstra 964
efficient 962
Ford-Fulkerson 979
Gauss 837
Gauss-Seidel 848
Greedy 967
Kruskal967
Moore 960
polynomially bounded 962
Prim 971
Runge-Kutta 892,904
stable, unstable 783
Aliasing 526
Allowable number of defectives 1073
Alternating
direction implicit method 915
path 983
Alternative hypothesis 1058
Ampere 92
Amplification 89
Amplitude spectrum 506
Analytic function 175,617,681
Analytic at infinity 711
Angle
between curves 35
between vectors 372
Angular speed 38 L 765
Annulus 613
Anticommutative 379
AOQ, AOQL 1075-1076
Approximate solution of
differential equations 9, 886-934
eigenvalue problems 863-882
equations 787-796
systems of equations 833-858
Approximation
least squares 860
polynomial 797
trigonometric 502
A priori estimate 794
AQL 1074
Arc of a curve 391
Archimedian principle 68
Arc length 393
Arctan 634
Area 435, 442, 454
Argand diagram 605
Argument 607
Artificial variable 949
Assignment problem 982
Associated Legendre functions 182
Asymptotically
equal 191, 1009
normal 1057
stable 148
n
12
Index
Attractive 148
Augmented matrix 288, 833
Augmenting path 975, 983
theorem 977, 984
Autonomous 31, 151
Average (see Mean value)
Average outgoing quality 1075
Axioms of probability 100 I
B
Back substitution 289, 834
Backward
differences 807
edge 974, 976
Euler method 896, 907
Band matrix 914
Bashforth method 899
Basic
feasible solution 942. 944
variable~ 945
Basis 49, 106. 113, 138, 300, 325, 360
Beam 120,547
Beats 87
Bellman optimality principle 963
Bell-shaped curve 1026
Bernoulli 30
distribution 1020
equation 30
law of large numbers 1032
numbers 690
Bessel 189
equation 189, 204
functions 191, 198, 202, 207, A94
functions, tables A94
inequality 215, 504
Beta function A64
Bezier curves 816
BFS 960
Bijective mapping 729
Binary 782
Binomial
coefficients 1009
distribution 1020, A96
series 689
theorem 1010
Binormal (Fig. 210) 397
Bipartite graph 982, 985
Birthday problem 1010
Bisection method 796
Bolzano-Weierstrass theorem A91
Bonnet 181
Boundary 433
conditions 203, 540, 558, 571, 587
point 433, 613
value problem 203, 558
Bounded
domain 646
function 38
region 433
sequence A69
Boxplot 995
Branch
cut 632
point 746
Breadth first search 960
Buoyance force 68
C
Cable 52, 198, 593
CAD (Computer aided design) 810
Cancellation law 321
Cantor-Dedekind axiom A69
Capacitance 92
Capacitor 92
Capacity
of a cut set 976
of an edge 973
Cardano 602
Cardioid 443
Cm1esian coordinates 366, 604
CAS (Computer algebra system) vii, 777
Catenary 399
Cauchy 69
convergence principle 667, A90
determinant 112
-Goursat theorem 647
-Hadamard formula 676
inequality 660
integral formula 654
integral theorem 647, 652
method of steepest descent 938
principal value 719, 722
product 680
-Riemann equations 37, 618, 621
-Schwarz inequality 326, 859
Cayley transformation 739
13
Index
Center 143
of a graph 973
of gravity 436, 457
of a power series 171
Central
differences 808
limit theorem 1057
moments 1019
Centrifugal, centripetal 396
Cgs system: Front cover
Chain rule 401
Characteristic
determinant 336. 864
equation 59, Ill, 336, 551, 864
function 542, 574
of a partial differential equation 551
polynomial 336, 864
value 324, 864
vector 324, 864
Chebyshev polynomials 209
Chinese postman problem 963
Chi-square 1055, 1077, AlOl
Cholesky's method 843
Chopping 782
Chromatic number 987
Circle 391
of convergence 675
Circuit 95
Circular
disk 613
helix 391, 394
membrane 580
Circulation 764
Cissoid 399
Clairaut equation 34
Class intervals 994
Closed
disk 613
integration formula 822, 827
interval A69
path 959
point set 613
region 433
Coefficient matrix 288. 833
Coefficients of a
differential equation 46
power series 171
system of equations 287, 833
Cofactor 309
Collatz's theorem 870
Column 275
space 300
sum norm 849
vector 275
Combination 1007
Combinatorial optimization 954-986
Comparison test 668
Complement 613, 988
Complementary
enor function A64
Fresnel integrals A65
sine integral A66
Complementation rule 1002
Complete
bipartite graph 987
graph 958
matching 983
orthonormal set 214
Complex
conjugate numbers 605
exponential function 57, 623
Fourier integral 519
Fourier series 497
function 614
hyperbolic functions 628, 743
impedance 98
indefinite integral 637
integration 637~663, 7l2~725
line integral 633
logarithm 630, 688
matrices 356
number 602
plane 605
plane. extended 710
potential 761
sequence 664
series 666
sphere 710
trigonometric functions 626. 688
trigonometric polynomial 524
variable 614
vector space 324. 359
Complexity 961
Component 366,374
Composite transformation 281
Compound interest 9
Compressible fluid 412
Computer
aided design (CAD) 810
algebra system (CAS) vii, 777
14
Index
Computer (Com.)
graphics 287
software (see Software)
Conchoid 399
Condition number 855
Conditionally convergent 667
Conditional probability 1003
Conduction of heat 465, 552, 757
Cone 406, 448
CONF 1049
Confidence
intervals 1049-1058
level L049
limits 1049
Conformal mapping 730. 754
Conic sections 355
Conjugate
complex numbers 605
harmonic function 622
Connected
graph 960
set 613
Conservative 415. 428
Consistent equations 292. 303
Constraints 937
Consumer's risk 1075
Continuity
of a complex function 615
equation 413
of a vector function 387
Continuous
distribution 1011
random variable 101 L. 1034
Contour integral 647
Contraction 789
Control
chart 1068
limit 1068
variables 936
Convergence
absolute 667
circle of 675
conditional 667
interval 172. 676
of an iterative process 793, 848
mean 214
mean-square 214
in norm 214
principle 667
radius 172
Convergence (Cont.)
of a sequence 386, 664
of a series ] 71, 666
superlinear 795
tests 667-672
uniform 691
Conversion of an ODE to a system 134
Convolution 248. 523
Cooling 14
Coordinate transformations A 71. A84
Coordinates
Cartesian 366, 604
curvilinear A71
cylindrical 587, A71
polar 137. 443, 580, 607
spherical 588, A71
Coriolis acceleration 396
COlTector 890, 900
COlTelation analysis 1089
Cosecant 627, A62
Cosine
of a complex variable 627, 688, 743
hyperbolic 688
integral A66. A95
of a real variable A60
Cotangent 627. A62
Coulomb 92
law 409
Covariance 1039, 1085
Cramer's rule 306-307, 312
Crank-Nicolson method 924
Critical
damping 65
point 31, 142,730
region 1060
Cross product 377
Crout's method 841
Cubic spline 811
Cumulative
distribution function 1011
frequency 994
Curl 414.430,472, A71
Curvature 397
Curve 389
arc length of 393
fitting 859
orientation of 390
piecewise smooth 421
rectifiable 393
simple 391
15
Index
Curve (Cont.)
smooth 421, 638
twisted 391
Curvilinear coordinates A71
Cut set 976
Cycle 959
Cycloid 399
Cylinder 446
flow around 763, 767
Cylindrical coordinates 587, A71
o
D' Alembert's solution 549, 551
Damping 64, 88
Dantzig 944
DATA DESK 991
Decay 5
Decreasing sequence A69
Decrement 69
Dedekind A69
Defect 337
Defective item 1073
Definite complex integral 639
Definiteness 356
Deformation of path 649
Degenerate feasible solution 947
Degree of
precision 822
a vertex 955
Degrees of freedom 1052, 1055, 1066
Deleted neighborhood 712
Delta
Dirac 242
Kronecker 210, A83
De Moi vre 610
formula 610
limit theorem 1031
De Morgan's laws 999
Density 1014
Dependent
linearly 49, 74, 106,21.)7, 300, 325
random variables 1036
Depth first search 960
Derivative
of a complex function 616, 658
directional 404
left-hand 484
right-hand 484
of a vector function 387
DERIVE 778
Descartes 366
Determinant 306, 308
Cauchy 112
characteristic 336. 864
of a matrix 305
of a matrix product 321
Vandermonde 112
DFS 960
Diagonalization 351
Diagonal matrix 284
Diagonally dominant matrix 868
Diameter of a graph 973
Differences 802, 804. 807-808
Difference table 803
Differentiable complex function 616
Differential 19, 429
form 20, 429
geometry 389
operator 59
Differential equation (ODE and PDE)
Airy 552, 904
Bernoulli 30
Bessel 189. 204
Cauchy-Riemann 37, 618. 621
with constant coefficients 53, III
elliptic 551, 909
Euler-Cauchy 69, 185
exact 20
homogeneous 27, 46, 105, 535
hyperbolic 551,928
hypergeometric 188
Laguerre 257
Laplace 407. 465. 536. 579. 587. 910
Legendre 177, 204. 590
linear 26, 45, 105, 535
nonhomogeneous 27, 46, 105, 535
nonlinear 45, 151,535
numeric methods for 886-934
ordinary 4
parabolic 551, 909, 922
partial 535
Poisson 910. 918
separable 12
Sturm-Liouville 203
of vibrating beam 547, 552
of vibrating mass 61,86, 135, 150,243,261,
342,499
of vibrating membrane 569-586
of vibrating string 538
16
Index
Differentiation
analytic functions 691
complex functions 616
Laplace transforms 254
numeric 827
power series 174, 680
series 696
vector functions 387
Diffusion equation 464. 552
Diffusivity 552
Digraph 955
Dijkstra's algorithm 964
Dimension of vector space 300, 325. 369
Diodes 399
Dirac's delta 242
Directed
graph 955
line segment 364
path 982
Directional derivative 404
Direction field 10
Direct method 845
Dirichlet 467
discontinuous factor 509
problem 467, 558, 587,915
Discharge of a source 767
Discrete
Fourier transform 525
random variable 1011. 1033
spectrum 507, 524
Disjoint events 998
Disk 613
Dissipative 429
Distribution 1010
Bernoulli 1020
binomial 1020, A96
chi-square 1055. AIOI
continuous 1011
discrete 1011.1033
Fisher's F- 1066, AI02
-free test 1080
function 1011. 1032
Gauss 1026
hypergeometric 1024
marginal 1035
multinomial 1025
normal 1026. 1047-1057, 1062-1067, A98
Poisson 1022, 1073. A97
Student's t- 1053. AIOO
two-dimensional 1032
Distribution (COli f.)
uniform 1015, 1017,1034
Divergence
theorem of Gauss 459
of vector fields 410, A 72
Divergent
sequence 665
series 171, 667
Di vided differences 802
Division of complex numbers 604, 609
Domain 401,613,646
Doolittle's method 841
Dot product 325, 371
Double
Fourier series 576
integral 433
labeling 968
precision 782
Driving force 84
Drumhead 569
Duffing equation 159
Duhamel's formula 597
E
Eccentricity of a vertex 973
Echelon form 294
Edge 955
coloring 987
incidence list 957
Efficient algorithm 962
Eigenbasis 349
Eigenfunction 204, 542, 559, 574
expansion 210
Eigenspace 336, 865
Eigenvalue problems for
matrices (Chap. 8) 333-363
matrices, numerics 863-882
ODEs (Sturm-Liouville problems) 203-216
PDEs 540-593
systems of ODEs 130-165
Eigenvector 334, 864
EISPACK 778
Elastic membrane 340
Electrical network (see Networks)
Electric circuit (see Circuit)
Electromechanical analogies 96
Electromoti ve force 91
Electrostatic
field 750
17
Index
Electrostatic (COllt.)
potential 588-592, 750
Element of matrix 273
Elementary
matrix 296
operations 292
Elimination of first derivative 197
Ellipse 390
Ellipsoid 449
Elliptic
cylinder 448
paraboloid 448
partial differential equation 551, 909
Empty set 998
Engineering system: Front cover
Entire function 624,661, 711
Entry 273, 309
Equality of
complex numbers 602
matrices 275
vectors 365
Equally likely 1000
Equipotential
lines 750, 762
surfaces 750
Equivalence relation 296
Equivalent linear systems 292
Erf 568, 690. A64, A95
Error 783
bound 784
estimation 785
function 568, 690. A64. A95
propagation 784
Type I, Type II 1060
Essential singularity 708
Estimation of parameters 1046-1057
Euclidean
norm 327
space 327
Euler 69
backward methods for ODEs 896, 907
beta function A64
-Cauchy equation 69, 108. 116. 185. 589
-Cauchy method 887, 890, 903
constant 200
formula 58, 496, 624, 627, 687
formulas for Fourier coefficients 480, 487
graph 963
numbers 690
method for systems 903
Euler (Cont.)
trail 963
Evaporation L8
Even function 490
Event 997
Everett interpolation formula 809
Exact
differential equation 20
differential form 20, 429
Existence theorem
differential equations 37, 73, 107, 109, 137,
175
Fourier integral 508
Fourier series 484
Laplace transforms 226
Linear equations 302
Expectation 10 16
Experiment 997
Explicit solution 4
Exponential
decay 5
function, complex 57, 623
fUnction, real A60
growth 5, 31
integral A66
Exposed vertex 983
Extended complex plane 710, 736
Extension. periodic 494
Extrapolation 797
Extremum 937
F
Factorial function 192, 1008, A95
Failure 1021
Fair die 1000
Falling body 8
False position 796
Family of curves 5, 35
Faraday 92
Fast Fourier transform 526
F-distribution 1066. A 102
Feasible solution 942
Fehlberg 894
Fibonacci numbers 683
Field
conservative 415. 428
of force 385
gravitational 385,407,411,587
18
Index
Field (Cant.)
irrotational 415, 765
scalar 384
vector 384
velocity 385
Finite complex plane 710
First
fundamental form 457
Green's formula 466
shifting theorem 224
Fisher, R. A. 1047, 1066, 1077
F -distribution I 066, A 102
Fixed
decimal point 781
point 736, 781, 787
Flat spring 68
Floating point 781
Flow augmenting path 975
Flows in networks 973-981
Fluid flow 412,463, 761
Flux 412, 450
integral 450
Folium of Descarte" 399
Forced oscillations 84, 499
Ford-Fulkerson algorithm 979
Forest 970
Form
Hermitian 361
quadratic 353
skew-Hermitian 361
Forward
differences 804
edge 974. 976
Four-color theorem 987
Fourier 477
-Bessel series 213, 583
coefficients 480, 487
coefficients, complex 497
constants 210
cosine integral 511
cosine series 491
cosine transform 514, 529
double series 576
half-range expansions 494
integral 508, 563
integral, complex 519
-Legendre series 212, 590
matrix 525
series 211, 480. 487
series, complex 497
Fourier (Cant.)
series, generalized 210
sine integral 511
sine series 491, 543
sine transform 514.530
transform 519. 531, 565
transform, discrete 525
transform, fast 526
Fractional linear transformation 734
Fraction defective J073
Fredholm 20 I
Free
fall 18
oscillations 61
Frene! formulas 400
Frequency 63
of values in samples 994
Fresnel integrals 690, A65
Friction 18-19
Frobenius 182
method 182
norm 849
theorem 869
Fulkerson 979
Full-wave rectifier 248
Function
analytic 175, 617
Bessel 191, 198,202,207, A94
beta A64
bounded 38
characteristic 542, 574
complex 614
conjugate harmonic 622
entire 624, 661. 71 I
error 568, 690, AM, A95
even 490
exponential 57, 623, A60
factorial 192, 1008, A95
gamma 192, A95
Hankel 202
harmonic 465, 622, 772
holomorphic 617
hyperbolic 628, 743. A62
hypergeometric 188
inverse hyperbolic 634
inverse trigonometric 634
Legendre 177
logarithmic 630. A60
meromorphic 711
Neumann 201
Index
Function (Collt.)
odd 490
orthogonal 205, 482
orthonormal 205, 210
periodic 478
probability 1012. 1033
rational 617
scalar 384
space 382
staircase 248
step 234
trigonometric 626, 688, A60
unit step 234
vector 384
Function space 327
Fundamental
form 457
matrix 139
mode 542
period 485
system -1-9, 106, 113, 138
theorem of algebra 662
G
Galilei 15
Gamma function 192, A95
GAMS 778
Gau~s 188
distribution 1026
divergence theorem 459
elimination method 289. 834
hypergeometric equation 188
integration formula 826
-Jordan elimination 317, 844
least squares 860, 1084
quadrature 826
-Seidel iteration 846, 913
General
powers 632
solution 64. 106. 138, 159
Generalized
Fourier series 210
function 242
solution 545
triangle inequality 608
Generating function 181,216,258
Geometric
multiplicity 337
series 167,668,673,687,692
19
Gerschgorin's theorem 866
Gibbs phenomenon 490, 510
Global error 887
Golden Rule 15,23
Goodness of fit 1076
Gosset 1066
Goursat 648, A88
Gradient 403, 415. 426, A72
method 938
Graph 955
bipartite 982
complete 958
Euler 963
planar 987
sparse 957
weighted 959
Gravitation 385, 407, 411, 587
Greedy algorithm 967
Greek alphabet: Back cover
Green 439
formulas 466
theorem 439, ..J.66
Gregory-Newton formulas 805
Growth restriction 225
Guldin's theorem 458
H
Hadamard's formula 676
Half-life time 9
Half-plane 613
Half-range Fourier series 494
Half-wave rectifier 248, 489
Halving 819, 824
Hamiltonian cycle 960
Hanging cable 198
Hankel functions 202
Hard spring 159
Harmonic
conjugate 622
function 465, 622. 772
oscillation 63
series 670
Heat
equatiun 464, 536, 553, 757, 923
potential 758
Heaviside 221
expansions 245
formulas 247
110
Index
Heaviside (Cont.)
function 234
Helicoid 449
Helix 391, 394, 399
Helmholtz equation 572
Henry 92
Hermite
interpolation 816
polynomials 216
Hermitian 357, 361
Hertz 63
Hesse's normal form 375
Heun's method 890
High-frequency line equations 594
Hilbert 201, 326
matrix 858
space 326
Histogram 994
Holomorphic 617
Homogeneous
differential equation 27, 46, 105. 535
system of equations 288, 304, 833
Hooke's law 62
Householder's tridiagonalization 875
Hyperbolic
differential equation 551,909,928
functions, complex 628, 743
functions, real A62
paraboloid 448
partial differential equations 551, 928
spiral 399
Hypergeometric
differential equation 188
distribution 1024
functions 188
series 188
Hypocycloid 399
Hypothesis 1058
Idempotent matrix 286
Identity
of Lagrange 383
matrix (see Unit matrix)
theorem for power ~eries 679
transformation 736
trick 351
Ill-conditioned 851
Image 327, 729
Imaginary
axis 604
part 602
unit 602
Impedance 94, 98
Implicit solution 20
Improper integral 222. 719. 722
Impulse 242
IMSL 778
Incidence
list 957
matrix 958
Inclusion theorem ~68
Incomplete gamma function A64
Incompressible 765
Inconsistent equations 292, 303
Increasing sequence A69
Indefinite
integral 637, 650
integration 640
Independence of path 426, 648
Independent
events 1004
random variables 1036
Indicial equation 184
Indirect method 845
Inductance 92
Inductor 92
Inequality
Bessel 215, 504
Cauchy 660
ML- 644
Schur 869
triangle 326. 372. 608
Infinite
dimensional 325
population 1025, 1045
sequence 664
series (see Series)
Infinity 710, 736
Initial
condition 6, 48, 137, 540
value problem 6. 38.48. 107,886,902
Injective mapping 729
Inner product 325. 359, 371
space 326
lnput 26, 84. 230
Instability (see Stability)
Integral
contour 647
111
Index
Integral (Cant.)
definite 639
double 433
equation 252
Fourier 508, 563
improper 719. 722
indefinite 637. 650
line 421, 633
surface 449
theorems, complex 647, 654
theorems, real 439, 453, 469
transform 221, 513
triple 458
Integrating factor 23
Integration
complex functions 637-663, 701-727
Laplace transforms255
numeric 8l7-827
power series 680
series 695
Integra-differential equation 92
Interest 9, 33
Interlacing of zeros 197
Intermediate value theorem 796
Interpolation 797-815
Hermite 816
Lagrange 798
Newton 802, 805, 807
spline 811
Interquartile range 995
Intersection of events 998
Interval
closed A69
of convergence 172, 676
estimate 1046
open 4, A69
Invariant subspace 865
Inverse
hyperbolic functions 634
mapping plinciple 733
of a matrix 315, 844
trigonometric functions 634
Inversion 735
Investment 9, 33
Irreducible 869
Irregular boundary 919
Irrotational415,765
Isocline 10
Isolated singularity 707
Isotherms 758
Iteration
for eigenvalues 872
for equations 787-794
Gauss-Seidel 846. 913
Jacobi 850
Picard 41
J
Jacobian 436, 733
Jacobi iteration 850
Jerusalem, Shrine of the Book 814
Jordan 316
Joukowski airfoil 732
K
Kirchhoff's laws 92, 973
Kronecker delta 210, A83
Kruskal's algorithm 967
Kutta 892
L
l10 l2, lex; 853
L 2 863
Labeling 968
Lagrange 50
identity of 383
interpolation 798
Laguerre polynomials 209, 257
Lambert's law 43
LAPACK 778
Laplace 221
equation 407, 465, 536, 579, 587, 910
integrals 512
limit theorem 1031
operator 408
transform 221, 594
Laplacian 443, A73
Latent root 324
Laurent series 701, 712
Law of
absorption 43
cooling 14
gravitation 385
large numbers 1032
mass action 43
the mean (see Mean value theorem)
LC-circuit 97
III
Index
LCL 1068
Least squares 860, 1084
Lebesgue 863
Left-hand
derivative 484
limit 484
Left-handed 378
Legendre 177
differential equation 177, 204, 590
functions 177
polynomials 179, 207, 590, 826
Leibniz 14
convergence test A 70
Length
of a curve 393
of a vector 365
Leonardo of Pisa 638
Leontief 344
Leslie model 341
Libby 13
Liebmann's method 913
Likelihood function 1047
Limit
of a complex function 615
cycle 157
left-hand 484
point A90
right-hand 484
of a sequence 664
vector 386
of a vector function 387
Lineal element 9
Linear
algebra 271-363
combination 106, 325
dependence 49, 74, 106, 108, 297, 325
differential equation 26, 45, 105, 535
element 394, A 72
fractional transformation 734
independence 49, 74, 106, 108, 297, 325
interpolation 798
operator 60
optimization 939
programming 939
space (see Vector space)
system of equations 287, 833
transformation 281, 327
Linearization of systems of ODEs 151
Line integral 421, 633
Lines of force 751
UNPACK 779
Liouville 203
theorem 661
Lipschitz condition 40
List 957
Ljapunov 148
Local
error 887
minimum 937
Logarithm 630, 688, A60
Logarithmic
decrement 69
integral A66
spiral 399
Logistic population law 30
Longest path 959
Loss of significant digits 785
Lotka-Volterra population model 154
Lot tolerance per cent defective 1075
Lower
control limit 1068
triangular matlix 283
LTPD 1075
LU-factorization 841
M
Maclaurin 683
series 683
trisectrix 399
Magnitude of a vector (see Length)
Main diagonal 274, 309
Malthus's law 5, 31
MAPLE 779
Mapping 327, 729
Marconi 63
Marginal distributions 1035
Markov process 285, 341
Mass-sPling system 61, 86, 135, 150, 243, 252,
261,342,499
Matching 985
MATHCAD 779
MATHEMATICA 779
Mathematical expectation 1019, 1038
MATLAB 779
Matlix
addition 275
augmented 288, 833
band 914
Index
Matrix (COllt.)
diagonal 284
eigenvalue problem 333-363. 863-882
Hermitian 357
identity (see Unit matrix)
inverse 315
inversion 315, 844
multiplication 278, 321
nonsingular 315
norm 849. 854
normal 362. 869
null (see Zero matrix)
orthogonal 345
polynomial 865
scalar 284
singular 315
skew-Hermitian 357
skew-symmetric 283, 345
sparse 812, 912
square 274
stochastic 285
symmetric 283, 345
transpose 282
triangular 283
tridiagonal 812. 875, 914
unit 284
unitary 357
zero 276
Max-flow min-cut theorem 978
Maximum 937
flow 979
likelihood method 1047
matching 983
modulus theorem 772
principle 773
Mean convergence 214
Mean-square convergence 214
Mean value of a (an)
analytic function 771
distribution IO 16
function 764
harmonic function 772
sample 996
Mean value theorem 402, 434, 454
Median 994, 1081
Membrane 569-586
Meromorphic function 711
Mesh incidence matrix 278
Method of
false position 796
113
Method of (Com.)
least squares 860. 1084
moments 1046
steepest descent 938
undetermined coefficients 78, 117, 160
variation of parameters 98, 118, 160
Middle quartile 994
Minimum 937, 942, 946
MINITAB 991
Minor 309
Mixed
boundary value problem 558. 587. 759. 917
triple product 381
Mixing problem 13, 130, ]46, 163,259
Mks system: Front cover
ML-inequality 644
Mobius 453
strip 453, 456
transformation 734
Mode 542. 582
Modeling 2. 6. 13. 61. 84. 130. 159. 222, 340, 499,
538,569.750-767
Modified Bessel functions 203
Modulus 607
Molecule 912
Moivre's formula 610
Moment
central 1019
of a distribution 1019
of a force 380
generating function 1026
of inertia 436. 455. 457
of a sample 1046
vector 380
Monotone sequence A69
Moore's shortest path algorithm 960
Morera's theorem 661
Moulton 900
Moving trihedron (see Trihedron)
M-test for convergence 969
Multinomial distribution 1025
Multiple point 391
Multiplication of
complex numbers 603, 609
determinants 322
matrices 278, 321
means ]038
power series 174, 680
vectors 2]9, 371, 377
Multiplication rule for events 1003
114
Index
Multiplicity 337, 865
Multiply connected domain 646
Multistep method 898
"Multivalued function" 615
Mutually exclusive events 998
N
Nabla 403
NAG 779
Natural
frequency 63
logarithm 630, A60
spline 812
Neighborhood 387, 613
Nested form 786
NETLIB 779
Networks 132, 146, 162,244,260.263.277,331
in graph theory 973
Neumann, C. 201
functions 20 I
problem 558, 587, 917
Newton 14
-Cotes formulas 822
interpolation formulas 802, 805, 807
law of cooling 14
law of gravitation 385
method 800
-Raphson method 800
second law 62
Neyman 1049, 1058
Nicolson 924
Nicomedes 399
Nilpotent matrix 286
NIST 779
Nodal
incidence matrix 277
line 574
Node 142, 797
Nonbasic variables 945
Nonconservative 428
Nonhomogeneous
differential equation 27, 78, 116, 159, 305, 535
system of equations 288. 304
Nonlinear differential equations 45. 151
Nonorientable smface 453
Nonparametric test 1080
Nonsingular matrix 315
Nonn 205, 326, 346, 359, 365, 849
Normal
acceleration 395
asymptotically 1057
to a curve 398
derivative 444, 465
distribution 1026, 1047-1057, 1062-1067, A98
two-dimensional 1090
equations 860. 1086
form of a PDE 551
matrix 362, 869
mode 542, 582
plane 398
to a plane 375
random variable 1026
to a surface 447
vector 375, 447
Null
hypothesis 1058
matrix (see Zero matlix)
space 301
vector (see Zero vector)
Nullity 301
Numeric methods 777-934
differentiation 827
eigenvalues 863-882
equations 787-796
integration 817-827
interpolation 797-816
linear equations 833-858
matrix inversion 315, 844
optimization 936-953
ordinary differential equations (ODEs)
886-908
partial differential equations (PDEs) 909-930
Nystrom method 906
o
0962
Objective function 936
OC curve 1062
Odd function 490
ODE 4 (see also Differential equations)
Ohm's law 92
One-dimensional
heat equation 553
wave equation 539
One-sided
derivative 484
test 1060
115
Index
One-step method 898
One-to-one mapping 729
Open
disk 613
integration formula 827
interval A69
point set 613
Operating characteristic 1062
Operational calculus 59, 220
Operation count 838
Operator 59. 327
Optimality principle. Bellman's 963
Optimal solution 942
Optimization 936-953, 959-990
Orbit 141
Order 887, 962
of a determinant 308
of a differential equation 4, 535
of an iteration process 793
Ordering 969
Ordinary differential equations 2-269. 886-908
(see also Differential equation)
Orientable surface 452
Orientation of a
curve 638
surface 452
Orthogonal
coordinates A 71
curves 35
eigenvectors 350
expansion 210
functions 205, 482
matrix 345
polynomials 209
series 210
trajectories 35
transformation 346
vectors 326, 371
Orthonormal functions 205, 210
Oscillations
of a beam 547,552
of a cable 198
in circuits 9 I
damped 64, 88
forced 84
free 61, 500, 547
harmonic 63
of a mass on a spring 61, 86, 135, 150, 243,
252, 261, 342.499
of a membrane 569-586
Oscillations (Cont.)
self-sustained 157
of a string 204, 538. 929
undamped 62,
Osculating plane 398
Outcome 997
Outer product 377
Outlier 995
Output 26, 230
Overdamping 65
Overdetermined system 292
Overflow 782
Overrelaxation 851
Overtone 542
p
Paired comparison 1065
Pappus's theorem 458
Parabolic differential equation 551, 922
Paraboloid 448
Parachutist 12
Parallelepiped 382
Parallel flow 766
Parallelogram
equality 326, 372, 612
law 367
Parameter of a distribution 1016
Parametric representation 389, 446
Parking problem 1023
Parseval's equality 215. 504
Partial
derivative 388, A66
differential equation 535, 909
fractions 231, 245
pivoting 291, 834
sum 171,480, 666
Particular solution 6, 48. 106, 159
Pascal 399
Path
in a digraph 974
in a graph 959
of integration 421, 637
PDE 535, 909 (see a/so Differential equation)
Peaceman-Rachford method 915
Pearson. E. S. 1058
Pearson, K. 1066
Pendulum 68, 152, 156
Period 478
116
Index
Periodic
extension 494
function 478
Permutation 1006
Perron-Frobenius theorem 344, 869
Pfaffian form 429
Phase
angle 88
of complex number (see Argument)
lag 88
plane 141, 147
pOltrait 14 I, 147
Picard
iteration method 41
theorem 709
Piecewise
continuous 226
smooth 421, 448. 639
Pivoting 291, 834
Planar graph 987
Plane 315, 375
Plane curve 391
Poincare 216
Point
estimate 1046
at infinity 710. 736
set 613
source 765
spectrum 507,524
Poisson 769
distribution 1022, 1073, A97
equation 910, 918
integral formula 769
Polar
coordinates 437, 443, 580, 607-608
form of complex numbers 607
moment on inertia 436
Pole 708
Polynomial
approximation 797
matrix 865
Polynomially bounded 962
Polynomials 617
Chebyshev 209
Hermite 216
Laguerre 207, 257
Legendre 179. 207. 590. 826
trigonometric 502
Population in statistics 1044
Population models 31, 154. 341
Position vector 366
Positive definite 326. 372
Possible values 1012
Postman problem 963
Potential 407, 427,590, 750, 762
complex 763
theory 465, 749
Power
method for eigenvalues 872
of a test 1061
series 167, 673
series method 167
Precision 782
Predator-prey 154
Predictor-corrector 890, 900
Pre-Hilbert space 326
Prim's algorithm 971
Principal
axes theorem 354
branch 632
diagonal (see main diagonal)
directions 340
normal (Fig. 210) 397
part 708
value 607. 630. 632. 719. 722
Prior estimate 794
Probability 1000, 10m
conditional 1003
density 1014. 1034
distribution 10 10, 1032
function 1012, 1033
Producer's risk 1075
Product (see Multiplication)
Projection of a vector 374
Pseudocode 783
Pure imaginary number 603
Q
QR-factorization method 879
Quadratic
equation 785
form 353
interpolation 799
Qualitative methods 124. 139-165
Quality control 1068
Quartile 995
Quasilinear 551, 909
Quotient of complex numbers 604
117
Index
R
Rachford method 915
Radiation 7, 561
Radiocarbon dating 13
Radius
of convergence 172, 675, 686
of a graph 973
Random
experiment 997
numbers 1045
variable 1010. 1032
Range of a
function 614
sample 994
Rank of a matrix 297, 299. 31 I
Raphson 790
Rational function 617
Ratio test 669
Rayleigh 159
equation 159
quotient 872
RC-circuit 97, 237, 240
Reactance 93
Real
axis 604
part 602
sequence 664
vector space 324, 369
Rectangular
membrane 571
pulse 238. 243
rule 817
wave 21 1, 480, 488, 492
Rectifiable curve 393
Rectification of a lot 1075
Rectifier 248, 489, 492
Rectifying plane 398
Reduction of order 50, 116
Region 433,614
Regression 1083
coefficient Im;5, L088
line 1084
Regula falsi 796
Regular
point of an ODE 183
Sturm-Liouville problem 206
Rejectable quality level 1075
Rejection region 1060
Relative
class frequency 994
Relati ve (Cont.)
error 784
frequency 1000
Relaxation 850
Remainder 171, 666
Removable singularity 709
Representation 328
Residual 849, 852
Residue 713
Residue theorem 715
Resistance 91
Resonance 86
Response 28, 84
Restoring force 62
Resultant of forces 367
Riccati equation 34
Riemann 618
sphere 710
surface 746
Right-hand
derivative 484
limit 484
Right-handed 377
Risk 1095
RL-circuit 97, 240
RLC-circuit 95, 241, 244, 499
Robin problem 558, 587
Rodrigues's formula 181
Romberg integration 829
Root 610
Root test 671
Rotation 381, 3R5, 414, 734, 764
Rounding 782
Row
echelon form 294
-equivalent 292, 298
operations 292
scaling 838
space 300
sum norm 849
vector 274
Runge-Kutta-Fehlberg 893
Runge-Kutta methods 892, 904
Runge-Kutta-Nystrom 906
s
Saddle point 143
Sample 997, 1045
covariance 1085
118
Index
Sample (Cont.)
distribution function 1076
mean 996, 1045
moments 1046
point 997
range 994
size 997, 1045
space 997
standard deviation 996, 1046
variance 996, 1045
Sampling 1004, 1023, 1073
SAS 991
Sawtooth wave 248, 493, 505
Scalar 276, 364
field 384
function 384
matIix 284
multiplication 276, 368
triple product 381
Scaling 838
Scheduling 987
Schoenberg 810
Scrnodinger 242
Schur's inequality 869
Schwartz 242
Schwarz inequality 326
Secant 627, A62
method 794
Second
Green's formula 466
shifting theorem 235
Sectionally continuous
(see Piecewise continuous)
SeIdel 846
Self-starting 898
Self-sustained oscillations 157
Separable differential equation 12
Separation of variables 12, 540
Sequence 664, A69
Series 666, A69
addition of 680
of Bessel functions 213, 583
binomial 689
convergence of 171, 666
differentiation of 174. 680
double Fourier 576
of eigenfunctions 210
Fourier 211,480.487
geometIic 167.668,673,687.692
harmonic 670
Series (Cont.)
hypergeometric 188
infinite 666, A70
integration of 680
Laurent 701, 712
MaclauIin 683
multiplication of 174. 680
of orthogonal functions 210
partial sums of 17 L 666
power 167, 673
real A69
remainder of 171, 666, 684
sum of 171,666
Taylor 683
tligonometric 479
value of 171, 666
Serret-Frenet formulas
(see Frenet formulas)
Set of points 613
Shifted data problem 232
Shifting theorems 224, 235, 528
Shortest
path 959
spanning tree 967
Shrine of the Book 814
Sifting 242
Significance level 1059
Significant
digit 781
in statistics 1059
Sign test 1081
Similar matrices 350
Simple
curve 391
event 997
graph 955
pole 708
zero 709
Simplex
method 944
table 945
Simply connected 640, 646
Simpson's rule 821
Simultaneous
corrections 850
differential equations 124
linear equations (see Linear systems)
Sine
of a complex variable 627, 688, 742
hyperbolic 688
119
Index
Sine (Cont.)
integral 509, 690, A65, A95
of a real variable A60
Single precision 782
Single-valued relation 615
Singular
at infinity 711
matrix 315
point 183, 686, 707
solution 8, 50
Sturm-Liouville problem 206
Singularity 686, 707
Sink 464, 765, 973
SI system: Front cover
Size of a sample 997. 1045
Skew-Hermitian 357, 361
Skewness L020
Skew-symmetric matrix 283, 345
Skydiver 12
Slack variable 941
Slope field 9
Smooth
curve 42 I, 638
piecewise 421, 448, 639
surface 448
Sobolev 242
Soft spring 159
Software 778, 991
Solution
of a differential equation 4, 46, 105, 536
general 6, 48, 106, 138
particular 6, 48, 106
singular 8, 50
space 304
steady-state 88
of a system of differential equations 136
of a system of equation~ 288
vector 288
SOR 851
Sorting 969, 993
Source 464. 765, 973
Span 300
Spanning tree 967
Sparse
graph 957
matrix 812, 912
system of equations 846
Spectral
density 520
mapping theorem 344,865
Spectral (Cont.)
radius 848
representation 520
shift 344, 865, 874
Spectrum 324, 542, 864
Speed 394
Sphere 446
Spherical coordinates 588, A 71
Spiral 399
point 144
Spline 81 I
S-PLUS, SPSS 992
Spring 62
Square
error 503
matrix 274
membrane 575
root 792
wave 211, 480, 488, 492
Stability 3 I, 148, 783, 822, 922
chart 148
Stagnation point 763
Staircase function 248
Standard
basis 328, 369
deviation 10 I 6
form of a linear ODE 26, 45, 105
Standardized random variable 1018
Stationary point 937
Statistical
inference 1044
tables A96-A I 06
Steady 413, 463
state 88
Steepest descent 938
Steiner 399, 457
Stem-and-leaf plot 994
Stencil 912
Step-by-step method 886
Step function 234
Step size control 889
Stereographic projection 7 I I
Stiff
ODE 896
system of ODEs 907
Stirling formula 1008, A64
Stochastic
matrix 285
variable 10 11
Stokes's theorem 469
120
Index
Straight line 375, 391
Stream function 762
Streamline 761
Strength of a source 767
Stlictly diagonally dominant 868
String 204, 538, 594
Student's (-distribution 1053, A I 00
Sturm-Liouville problem 203
Subgraph 956
Submarine cable equations 594
Submatrix 302
Subsidiary equation 220, 230
Subspace 300
invariant 865
Success [021
Successive
conections 850
overrelaxation 851
Sum (see Addition)
Sum of a selies 171,666
Superlinear convergence 795
Superposition principle 106, 138
Surface 445
area 435, 442, 454
integral 449
normal 406
Surjective mapping 729
Symmetric matrix 283, 345
System of
differential equations 124-165,258-263,902
linear equations (see Linear system)
units: Front cover
T
Tables
on differentiation: Front cover
of Fourier transforms 529-531
of functions A94-A 106
of integrals: Front cover
of Laplace transforms 265-267
statistical A96-A106
Tangent 627, A62
to a curve 392, 397
hyperbolic 629, A62
plane 406, 447
vector 392
Tangential acceleration 395
Target 973
Tarjan 971
Taylor 683
formula 684
series 683
Tchebichef (see Chebyshev)
(-distribution 1053, A100
Telegraph equations 594
Termination critelion 791
Termwise
differentiation 696
integration 695
multiplication 680
Test
chi-square 1077
for convergence 667-672
of hypothesis 1058-1 06R
nonparametric 1080
Tetrahedron 382
Thermal
conductivity 465, 552
diffusivity 465, 552
Three
-eights rule 830
-sigma limits 1028
Time tabling 987
Torricelli's law 15
Torsion of a curve 397
Torsional vibrations 68
Torus 454
Total differential 19
Trace of a matrix 344, 355, !:I64
Trail 959
TrajectOlies 35, 133, 141
Transfer function 230
Transformation
of Cartesian courdinates A84
by a complex function 729
of integrals 437, 439, 459, 469
linear 281
linear fractional 734
orthogonal 346
similarity 350
unitary 359
of vector components A83
Transient state 88
Translation 365, 734
Transmission line equations 593
Transpose of a matrix 282
Transpositions 1081. AI06
Trapezoidal rule 817
Traveling salesman problem 960
Index
Tree 966
Trend 1081
Trial 997
Triangle inequality 326, 372. 608
Triangular matlix 283
Tricomi equation 551. 55~
Tridiagonalization 875
Tridiagonal matrix 812, 875. 914
Trigonometric
approximation 502
form of complex numbers 607, 624
functions, complex 626, 688
functions, real A60
polynomial 502
series 479
system 479. 482
Trihedron 398
Triple
integral 458
product 381
Tlivial solution 27, 304
Truncation error 783
Tuning 543
Twisted curve 391
Two-dimensional
distribution 1032
normal distribution 1090
random variable 1032
wave equation 571
Two-sided test 1060
Type of a differential equation 551
Type I and II errors 1060
U
UeL 1068
Unconstrained optimization 937
Uncorrelated I 090
U mlamped system 62
Underdamping 66
Underdetemlined system 292
Underflow 782
Undetermined coefficients 79, 117, 160
Uniform
convergence 691
distribution 1015, 1017. 1034
Union of events 998
Uniqueness
differential equations 37, 73, 107, 137
Dirichlet problem 774
121
Uniqueness (Cont.)
Laurent series 705
linear equations 303
power series 678
Unit
binormal vector 398
circle 611
impulse 242
matrix 284
normal vector 447
principal normal vector 391:\
step function 234
tangent vector 392. 398
vector 326
Unitary
matrix 357
system of vectors 359
transformation 359
Unstable (see Stability)
Upper control limit 1068
v
Value of a series 171. 666
Vandermonde determinant I 12
Van der Pol equation 157
Variable
complex 614
random 10 I0, 1032
standardized random 1018
stochastic I 0 11
Variance of a
distribution 1016
sample 996, 1045
Variation of parameters 98. 118, 160
Vector 274, 364
addition 276. 327, 367
field 384
function 384
moment 380
norm 853
product 377
space 300, 323, 369
subspace 300
Velocity 394
field 385
potential 762
vector 394
Venn diagram 998
Verhulst 31
122
Index
Vertex 955
coloring 987
exposed 983
incidence list 957
Vibrations (see Oscillations)
Violin string 538
Vizing's theorem 987
Volta 92
Voltage drop 92
Volterra 154,201, 253
Volume 435
Vortex 767
Vorticity 764
w
Waiting time problem 10 13
Walk 959
Wave equation 536, 539, 569, 929
Weber 217
equation 217 .
Weber (Cont.)
functions 201
Website see Preface
Weierstrass 618. 696
approximation theorem 797
M-test 696
Weighted graph 959
Weight function 205
Well-conditioned 852
Wessel 605
Wheatstone bridge 296
Work 373, 423
integral 422
Wronskian 75, 108
z
Zero
of analytic function 709
matrix 276
vector 367
Systems of Units. Some Important Conversion Factors
The most important systems of units are shown in the table below. The mks system is also known as
the bztemational System of Units (abbreviated S/), and the abbreviations s (instead of sec),
g (instead of gm), and N (instead of nt) are also used.
Length
System of units
Mass
Time
Force
cgs system
centimeter (cm)
gram (gm)
second (sec)
dyne
mks system
meter (m)
kilogram (kg)
second (sec)
newton (nt)
Engineering system
foot (ft)
slug
second (sec)
pound (lb)
I inch (in.) = 2.540000 cm
I yard (yd)
I foot (ft)
= 3 ft = 91.440000 cm
= 12 in. = 30.480000 cm
I statute mile (mi)
= 5280 ft = 1.609344 km
I mi 2 = 640 acres
= 2.5899881
I nautical mile = 6080 ft = 1.853184 km
I acre
= 4840 yd2 = 4046.8564 m 2
= IIl28 U.S. gallon = 2311128 in. 3 = 29.573730 cm 3
I fluid ounce
I U.S. gallon = 4 quarts (liq) = 8 pints (liq)
I British Imperial and Canadian gallon
I slug
=
= 128 fl oz = 3785.4118 cm3
1.200949 U.S. gallons
= 4546.087 cm 3
= 14.59390 kg
I pound (lb)
= 4.448444 nt
I British thermal unit (Btu)
I calorie (cal)
I newton (nt)
= 1054.35 joules
= 105 dynes
I joule = 107 ergs
= 4.1840 joules
I kilowatt-hour (kWh)
= 3414.4 Btu = 3.6' 106 joules
I horsepower (hp) = 2542.48 Btu/h
I kilowatt (kW)
OF
km 2
= 178.298 cal/sec = 0.74570 kW
= 1000 watts = 3414.43 Btu/h = 238.662 cal/sec
= °C . 1.8 + 32
1° = 60' = 3600"
= 0.01 7453293 radian
For further details see, for example, D. Halliday, R. Resnick, and 1. Walker, FUl1damel1tals of Physics. 7th ed., New York:
Wiley. 2005. See also AN American National Standard. ASTMIlEEE Standard Metric Practice. Institute of Electrical and
Electronics Engineers, Inc., 445 Hoes L~ne. Piscataway, N. 1. 08854.
Integration
Differentiation
(eu)' = eu'
(e constant)
f uv' dx =
f u'v dx
ltV -
xn+l
(u
+ v)' = u' + v'
(uv)' = u'v
+ v'u
,
,
(: )'
2
du dy
du
-=-.dx
dy dx
(Chain rule)
f sin x dx = -cos x +
f cos dx = sin +
x
=
f sec x dx
(tan x)'
= sec2 x
(cotx), = -csc 2 x
(sinh x)' = cosh x
(cosh x)'
= sinh x
,
I
(lnx) = x
+e
-In Icos xl
In Isec x
=
e
e
x
f cotxdx = In Isinxl
(cosx)' = -sinx
(n oF -I)
feaxdx=~eax+e
f tan x dx
(sin x)' = cos x
e
e
uv-vu
v
+
f
f ~ dx = In Ixl +
xndx = - - n + 1
+
e
+ tan xl +
f csc x dx = In Icsc x - cot \:1 +
dx
x
2
2 = - arctan - +
f +
f Ya dx x = arcsin'::'a + e
f Yx dx+ = arcsinh .::. +
f Yx dx a = arccosh .::.a +
I
x
a
2
a
e
e
2
-
2
2
a
e
a
e
2
a
e
2
-
f sin2 x dx
= 12 x
- 14 sin 2x
+
e
! x + ~ sin 2x + e
f cos 2 X dx =
f tan 2 x dx = tan x - x + e
x
(arcsin x)'
(arccos x)'
f cot2 = -cotx - +
fIn x dx = x In x - x + e
xdx
x
f eax sin bx dx
= 2
eax
a
(arctan x)'
e
f
+
b
2 (a sin bx - b cos bx)
eax cos bx dx
eax
(arccotx)'
+e
=
a
2
+
b
2 (a cos bx
+
b sin bx)
+
e
Polar Coordinates
Some Constants
x = rcos 0
e = 2.71828 182845904523536
Ve = 1.64872 127070012814685
e 2 = 7.38905 60989 30650 22723
y
tanO=x
lOglO
=
7T
Series
= 0.49714987269413385435
=
I
1.14472 98858 4940017414
IOglO e = 0.434294481903251 82765
In 10 = 2.30258 50929 94045 68402
In
7T
v'2 =
= rdrdO
dxdy
3.14159265358979323846
~ = 9.86960 440lO 89358 61883
y:;;:- = 1.77245385090551602730
7T
y=rsinO
00
(Ixl
- - = ~ xm
1- x
<
1)
m~O
1.41421 356237309504880
-{Y2 = 1.25992 1049894873 16477
v'3 =
V'3 =
•
1.73205 08075 68877 29353
1.44224 95703 07408 38232
In 2 = 0.69314718055994530942
In 3 = 1.09861 228866810969140
cos x =
Alpha
IJ
Nu
p
Beta
g
Xi
Gamma
0
Omicron
Delta
7T
Pi
Epsilon
p
Rho
e
€,
~
Zeta
T}
Eta
0, tt,
e
Theta
L
Iota
K
Kappa
A,A
fL
cr, L
T
v, Y
l)!
(_I)mx 2m
~
(2m)!
m~O
xm
In (1 - x) = - ~ m
00
(Ixl
< 1)
m~l
(_l)m 2m+1
x
00
arctan x = ~
+
2m
m=O
(Ixl <
I
1)
Vectors
0'
0, !J.
+
(2m
00
Greek Alphabet
r
_l)mx 2m+ I
(
=~
m~O
'Y = 0.57721566490153286061
In 'Y = -0.549539312981644 82234
(see Sec. 5.6)
0
1 = 0.01745 32925 19943 29577 rad
1 rad = 57.29577 95130 82320 87680 0
= 57° 17' 44.806"
'Y,
00
Sill X
Sigma
a-b = alb l + a2b2 + a3b3
j
k
axb= al
a2
a3
bl
b2
b3
i
af
af
at
ax
ay
az
gradf = Vf = - i + - j + - k
Tau
Upsilon
.
cP. cp,<I> Phi
X
Chi
Lambda
I/J, 'It
Psi
Mu
w,n
Omega
aVl
diV v = V-v = -
curl v = V x v =
ax
aV2
+ -
ay
i
j
a
-
a
-
ax
ay
V2
VI
aV 3
+ -
az
k
a
az
V3