George S. Fishman
Monte Carlo
Concepts, Algorithms,
and Applications
With 98 Illustrations
Springer
George S. Fishman
Department of Operations Research
University of North Carolina
Chapel Hill, NC 27599-3180
USA
Series Editor
Peter Glynn
Department of Operations Research
Stanford University
Stanford, CA 94305
USA.
Library of Congress Cataloging-in-Publication Data
Fishman, George S.
Monte Carlo : concepts, algorithms, and applications / George S. Fishman.
    p. cm. — (Springer series in operations research)
Includes bibliographical references and index.
ISBN 0-387-94527-X (alk. paper)
1. Monte Carlo method. I. Title. II. Series.
QA298.F57 1995
519.2'82—dc20                                          95-17144
Printed on acid-free paper.
© 1996 Springer-Verlag New York, Inc.
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer-Verlag New York, Inc., 175 Fifth Avenue, New York, NY 10010,
USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with
any form of information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed is forbidden.
The use of general descriptive names, trade names, trademarks, etc., in this publication, even if the
former are not especially identified, is not to be taken as a sign that such names, as understood by the
Trade Marks and Merchandise Marks Act, may accordingly be used freely by anyone.
Production coordinated by Publishing Network and managed by Bill Imbornoni;
manufacturing supervised by Joe Quatela.
Typeset by Asco Trade Typesetting Ltd., Hong Kong.
Printed and bound by R.R. Donnelley and Sons, Harrisonburg, VA.
Printed in the United States of America.
9 8 7 6 5 4 3 (Corrected third printing, 1999)
ISBN 0-387-94527-X Springer-Verlag New York Berlin Heidelberg  SPIN 10689343

Preface
This book provides an introduction to the Monte Carlo method suitable for a
one- or two-semester course for graduate and advanced undergraduate students in
the mathematical and engineering sciences. It also can serve as a reference for the
professional analyst. In the past, my inability to provide students with a single-
source book on this topic for class and for later professional reference had left me
repeatedly frustrated, and eventually motivated me to write this book.
In addition to focused accounts of major topics, the book has two unifying
themes: One concerns the effective use of information and the other concerns error
control and reduction. The book describes how to incorporate information about
a problem into a sampling plan in a way that reduces the cost of estimating its
solution to within a specified error bound. Although exploiting special structures
to reduce cost long has been a hallmark of the Monte Carlo method, the propen-
sity of users of the method to discard useful information because it does not fit
traditional textbook models repeatedly has impressed me. The present account
aims at reducing the impediments to integrating this information.
Errors, both statistical and computational, abound in every Monte Carlo sam-
pling experiment, and a considerable methodology exists for controlling them.
However, error is not an appealing topic, especially in an age of easily accessible
software. Many conversations with students and professional users of the Monte
Carlo method have made me acutely aware of the seductive quality of the rapidly
developing user-friendly software for performing sampling experiments: If it exe-
cutes and produces a numerical result, why not use it? To serve a useful purpose,
a book on the Monte Carlo method should sensitize students to the potential
sources of error that lurk in every Monte Carlo sampling experiment and encour-
age them to recognize the limitations that off-the-shelf software may unwittingly
impose. Moreover, the book should provide techniques for reducing and, if pos-
sible, eliminating error. The present account does this for both statistical and
computational errors.
This concern for error and cost containment has led to several departures from
standard advice given to potential users. Chapter 2 does this with regard to error
assessment when independent replications generate the sample data. In particular,
the traditional reliance on asymptotic normal theory inevitably induces an error of
approximation in reporting statistical error. To remove this error, when possible,
the book describes distribution-free methods for statistical error reporting. While
these lead to a higher cost for achieving a specified error bound, they offer the
benefit of eliminating the need to answer the often vexing question: How large a
sample size is needed to make the error of normal approximation negligible?
Doubtlessly, some readers will react with skepticism to this emphasis. After all,
standard asymptotic normal theory for i.i.d. data provides a convenient basis
for assessing statistical error as the sample size n grows without bound. However,
n is always finite in practice, and it is well understood that convergence rates can
vary widely in different problems, leading to widely varying errors of approxima-
tion. When they apply, the distribution-free techniques eliminate these errors of
approximation.
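To make the trade-off concrete, the following minimal sketch (not drawn from the book; it assumes the estimand is a probability, so each observation lies in [0, 1]) contrasts the worst-case distribution-free (Hoeffding) sample size with the worst-case normal-approximation sample size for an (ε, δ) absolute error criterion.

```python
# Worst-case sample sizes for estimating a mean in [0, 1] to within +/- eps
# with probability at least 1 - delta.  Illustrative only: the n_H and n_N
# notation of Ch. 2 motivates the comparison, but the formulas below are the
# standard textbook bounds, not quotations from the book.
from math import ceil, log
from statistics import NormalDist

def hoeffding_n(eps, delta):
    # Hoeffding: P(|Xbar - mu| >= eps) <= 2*exp(-2*n*eps**2); solve for n
    return ceil(log(2.0 / delta) / (2.0 * eps ** 2))

def worst_case_normal_n(eps, delta):
    # Normal approximation with the worst-case variance bound p*(1 - p) <= 1/4
    z = NormalDist().inv_cdf(1.0 - delta / 2.0)
    return ceil(z ** 2 / (4.0 * eps ** 2))

if __name__ == "__main__":
    eps, delta = 0.01, 0.05
    print("distribution-free (Hoeffding) n:", hoeffding_n(eps, delta))
    print("worst-case normal-approximation n:", worst_case_normal_n(eps, delta))
```

With ε = 0.01 and δ = 0.05 the distribution-free requirement is roughly twice the normal-approximation requirement, which is the higher cost referred to above, paid in exchange for an exact rather than approximate coverage guarantee.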
While i.i.d. sampling represents a fundamental component of the Monte Carlo
method, a substantial proportion of the problems to which the method is applied
call for generating sample paths based on Markov chain formulations. These paths
of dependent observations provide the data for parameter estimation. When I
began work on this manuscript about nine years ago, the literature on Monte
Carlo Markov chain sampling experiments had been growing at a relatively slow
pace since the initial flurry of publications on this topic following World War II.
During these nine years, the rate of publication on this topic has increased substan-
tially, and a major portion of my time has been devoted to identifying those new
developments that I could weave into a coherent account accessible to a newcomer
to the Monte Carlo method. These developments include sampling techniques (e.g.,
Gibbs sampling), convergence analysis, computational complexity, and estimating
statistical accuracy from data on a single sample path and on multiple sample
paths. Chapter 5 first addresses the conceptual issues and then Ch. 6 turns to
the computational considerations. As writing progressed, Ch. 5 seemed to grow
without bound. This was a consequence of the many new ideas that were published
and the depth of analysis to which each had been exposed. Although I culled many
topics and reduced the space devoted to others, such as simulated annealing,
Ch. 5 remains the most substantial of the chapters. The conceptual foundations of
contemporary Monte Carlo Markov chain sampling are organized and presented
there in a manner that allows readers to skip over some sections which may be of
lesser interest to them without losing sight of the chapter’s underlying theme.
The account of computational issues in Ch. 6 also called for selectivity. Recently,
many papers have appeared with proposals for diluting the influence of initial
conditions on sample path data. These augment an existing literature on discrete-
event simulation that addresses this problem. Much reading, testing, and ponder-
ing led me to conclude that a theoretically well-supported and computationally
feasible methodology for solving this problem remains for future development.
Accordingly, the description of how to adjust for the influence of initial conditions
relies heavily on easily implementable, but admittedly subjective, graphical analy-
ses. Ideally, one hopes that more objective criteria grounded, for example, in
statistical hypothesis testing eventually will complement graphical interpretation
as a basis for choosing a warm-up interval.
By contrast, theoretically justified and computationally feasible methods for
computing confidence intervals based on single and multiple sample path data
long have been available in the time-series and discrete-event simulation literature.
However, the considerable number of careful decisions, based on theoretical and
computational considerations that successful implementation requires, has limited
the use of these techniques in practice. To overcome this limitation, Ch. 6 focuses
on the batch means method, a conceptually simple methodology, and describes an
implementation that removes many of the impediments. Software for employing
this method in the context of long sample paths is available by anonymous file
transfer protocol (ftp) as described in Sec. 1.2.
I owe debts of gratitude to many for their help. I have repeatedly benefitted from
conversations with my colleagues, Scott Provan and Sandy Stidham, at Chapel
Hill. They have patiently listened to my ideas, corrected my mistaken impressions,
and introduced me to an ever-increasing range of concepts that have substantially
broadened the account in this book. Colleagues at other institutions also con-
tributed. Ernst Stadlober read Ch. 3 on sampling and gave me the benefit of
his comments. Masanori Fushimi, Pierre L'Ecuyer, and Harald Niederreiter did
likewise for Ch. 7 on pseudorandom number generation. David Goldsman gave
me comments on an early version of Ch. 2, and Christos Alexopoulos provided
comments on Chs. 5 and 6. Laurence Baxter provided comments on a draft of the
manuscript. Lothar Afflerbach, Michael Ball, Wolfgang Hörmann, Pierre L'Ecuyer,
and Douglas Shier provided a variety of computations, tables, and graphs that I
have incorporated into the book. I am sincerely grateful to all of them.
My severest critics were my students. They identified many typographical errors
and non sequiturs that called for clarification. To Xiaotao Huang, Anupama
Narayanan, Seokoon Yun, Srinivas Rajagopal, Thomas Reid, Lindsey Puryear,
and Brit Richi, I express my sincere appreciation. Cristina Arguelles and Steve
Yarberry also are owed a special debt. In addition to their critiques of the material,
Cristina performed virtually all of the hands-on exercises that appear at the ends
of Chs. 2 through 7, Steve worked many of the analytical exercises, and both
provided innumerable tables and graphs. Moreover, Steve collaborated with me
on the development of the LABATCH software for computing confidence inter-
vals for sample path data. He also programmed and tested it. To acknowledge
Cristina's and Steve's assistance gives me great pleasure.
Kathleen Dulaney, Barbara Meadows, and Joan Stamper typed innumerable
revisions of the manuscript. Their exemplary patience has enabled me to refine
my thoughts on many subjects over these several years with the assurance that
Kathleen, Barbara and Joan would indulge me in yet another revision. I am
grateful to them for all of their help.
During the 1990-1991 academic year, I was on leave at the Institute of Statistics
and Decision Sciences at Duke University. The leave enabled me to lay out Chs. 5,
6, and 7 unimpeded by the many distractions that regularly came my way at
UNC-CH. I am grateful to John Geweke, the former head of the Institute, for
making space available for me; to Mike West, then the new head, for being so
hospitable; and to the entire faculty, Don Burdick, Valen Johnson, Michael Lavine,
Jean-François Richard, and Robert Wolpert, for their congeniality and intellectual
stimulation.
Past experience has taught me that converting a manuscript to a published book
can be frustrating. This was not the case this time. Martin Gilchrist and Bill
Imbornoni of Springer-Verlag consistently were responsive to my concerns and
regularly made constructive suggestions. Elise Oranges of Publishing Network
made editing as painless for me as it could be. I am grateful to Martin, Bill, and
Elise for all their help.

Contents
Preface
Selected Notation

1 Introduction
  1.1 About This Book
    1.1.1 Controlling Error
    1.1.2 Strategies for Reading This Book
    1.1.3 Intended Audience
  1.2 Available Software
    1.2.1 Unix Machines
    1.2.2 IBM Compatible Personal Computers
  1.3 What This Book Does Not Contain
  1.4 Conventions
  References
2 Estimating Volume and Count
  2.1 Volume
  2.2 Error and Sample Size Considerations
  2.3 Confidence Intervals
  2.4 Exploiting Regional Bounds
    2.4.1 Bounds on Volume
    2.4.2 Worst-Case and Best-Case Sample Sizes
    2.4.3 Worst-Case Normal Error
    2.4.4 Hyperrectangular Bounds
    2.4.5 Modifying the Sampling Distribution
  2.5 Relative Error
    2.5.1 Exponential Sample Size
  2.6 Network Reliability
  2.7 Multivariable Integration
    2.7.1 Quadrature Formulas
    2.7.2 Equidistributed Points
    2.7.3 Monte Carlo Sampling
  2.8 Exploiting Function Bounds
  2.9 Exploiting Parameter Bounds
  2.10 Restricting the Sampling Region
  2.11 Reducing the Sampling Dimension
  2.12 Counting Problems
  2.13 Sensitivity Analysis
    2.13.1 Worst-Case and Best-Case Sample Sizes for Absolute Error
    2.13.2 Example
    2.13.3 Worst-Case and Best-Case Sample Sizes for Relative Error
    2.13.4 Example
  2.14 Simultaneous Confidence Intervals
  2.15 Ratio Estimation
  2.16 Sequential Estimation
    2.16.1 Absolute Error
    2.16.2 Relative Error
    2.16.3 Mixed Error Criteria
3 Generating Samples
  3.1 Independence and Dependence
  3.2 Inverse Transform Method
    3.2.1 Continuous Distributions
    3.2.2 Restricted Sampling
    3.2.3 Preserving Monotonicity
    3.2.4 Discrete Distributions
  3.3 Cutpoint Method
    3.3.1 Restricted Sampling
    3.3.2 The Case of Large b − a
  3.4 Composition Method
  3.5 Alias Method
    3.5.1 One or Two Uniform Deviates
    3.5.2 Setting Up the Tables
    3.5.3 Comparing the Cutpoint and Alias Methods
  3.6 Acceptance-Rejection Method
    Example 3.1
    3.6.1 Squeeze Method
    Example 3.2
    3.6.2 Avoiding Logarithmic Evaluations
    3.6.3 Theory and Practice
  3.7 Ratio-of-Uniforms Method
    Example 3.3
  3.8 Exact-Approximation Method
  3.9 Algorithms for Selected Distributions
  3.10 Exponential Distribution
  3.11 Normal Distribution
  3.12 Lognormal Distribution
  3.13 Cauchy Distribution
  3.14 Gamma Distribution
  3.15 Beta Distribution
    min(α, β) < 1 and max(α, β) > 1
  3.16 Student's t Distribution
  3.17 Snedecor's F Distribution
  3.18 Revisiting the Ratio-of-Uniforms Method
  3.19 Poisson Distribution
  3.20 Binomial Distribution
  3.21 Hypergeometric Distribution
  3.22 Geometric Distribution
  3.23 Negative Binomial Distribution
  3.24 Multivariate Normal Distribution
  3.25 Multinomial Distribution
  3.26 Order Statistics
    3.26.1 Generating the Smallest or Largest Order Statistics
  3.27 Sampling Without Replacement and Permutations
    3.27.1 Generating k Out of n with Unequal Weights
  3.28 Points in and on a Simplex
    3.28.1 Points in 𝒮(b)\𝒮(a) for b ≥ a > 0
    3.28.2 Convex Polytopes
  3.29 Points in and on a Hyperellipsoid
  3.30 Bernoulli Trials
  3.31 Sampling from a Changing Probability Table
  3.32 Random Spanning Trees
  Exercises
  References

4 Increasing Efficiency
  4.1 Importance Sampling
    4.1.1 Converting Unboundedness to Boundedness
    4.1.2 Revisiting the Union Counting Problem
    4.1.3 Exponential Change of Measure
    4.1.4 Random Summations with Random Stopping Times
    4.1.5 M/M/1 Exceedance Probability
    4.1.6 Sequential Probability Ratio Test
  4.2 Control Variates
    4.2.1 Normal Control Variates
    4.2.2 Binary Control Variates and Stochastic Ordering
    4.2.3 Estimating the Distribution of Maximal Flow
  4.3 Stratified Sampling
    4.3.1 Sample Size Considerations
    4.3.2 Estimating a Distributional Constant
    4.3.3 Confidence Intervals
    4.3.4 Poststratified Sampling
  4.4 Inducing Correlation
    4.4.1 Estimation in One …
    4.4.3 Penalty for Misuse
    4.4.4 More Than Two Replications
  4.5 Conditional Monte Carlo
  Exercises
  References
5 Random Tours
  5.1 Markov Processes
  5.2 Random Walk
    5.2.1 Random Walk on a Lattice
    5.2.2 Normal Increments and Brownian Motion
  5.3 Markov Time
  5.4 Score Processes
  5.5 Neutron Transport
  5.6 Buffer Exceedance on a Production Line
  5.7 Fredholm Equations of the Second Kind
  5.8 Catastrophic Failure
  5.9 First Passage Time
  5.10 Random Stopping Time
  5.11 Generating Random Points from a Target Distribution
  5.12 Generating a Uniformly Distributed Point on a Finite Set
  5.13 Generating All Coordinates in a Bounded Region on Each Step
    5.13.1 Uniform Generation in a Convex Polytope
  5.14 Metropolis Method
    5.14.1 Continuous Case
  5.15 Sampling Coordinates One at a Time
    5.15.1 Randomly Selected Coordinates
      Contingency Tables
    5.15.2 Selecting Coordinates in a Fixed Order
  5.16 Markov Random Fields
    5.16.1 Gibbs Distribution
    5.16.2 Ising Model
  5.17 Gibbs Sampling
  5.18 Simulated Annealing
  5.19 Bayesian Posterior Distributions
    5.19.1 Gamma Conjugate Distributions
  5.20 Edge Effects
  5.21 Time to Stationarity
  5.22 Spectral Structure
    5.22.1 Nonreversible Markov Chains
  5.23 Bounds on Error
  5.24 Varying Initial Conditions
  5.25 Random Walk on a Hypercube
  5.26 Conductance
    5.26.1 Shifted Processes
  5.27 More About a Random Walk on a Hypercube
  5.28 Error Bound for Stationarity
  5.29 Sampling from a Hyperrectangular Grid
  5.30 Sampling from a Hyperrectangle
  5.31 Product Estimator
    5.31.1 Optimal Assignment
    5.31.2 Markov Chain Sampling
  5.32 Estimating the Volume of a Convex Body
  5.33 Estimating the Permanent
  5.34 Coupling
  5.35 Strong Markov Time
  5.36 Strong Stationary Dual Process
    5.36.1 Random Walk on the Integers
    5.36.2 Another Walk on a Hypercube
  5.37 Thresholds
  Exercises
  References
6 Designing and Analyzing Sample Paths
  6.1 Problem Context
    6.1.1 Single Replication
    6.1.2 Multiple Replications
  6.2 A First Approach to Computing Confidence Intervals
  6.3 Warm-Up Analysis
    6.3.1 Starting Each Replication in the Same State
    6.3.2 Estimating Path Length
    6.3.3 Choosing n₀ and t₀
    6.3.4 Stratifying the Assignments of Initial States
    6.3.5 Randomly Assigning Initial States
  6.4 Choosing a "Good" Initial State or a "Good" t₀
  6.5 Strictly Stationary Stochastic Processes
  6.6 Optimal Choice of Sample Path Length t and Number of Replications
    6.6.1 More General Correlation Structures
    6.6.2 Negative γ
  6.7 Estimating Required Sample Path Length
  6.8 Characterizing Convergence
    6.8.1 I.I.D. Sequences
    6.8.2 φ-Mixing Sequences
    6.8.3 Strongly Mixing Sequences
  6.9 An Alternative View of var X̄_t
  6.10 Batch Means Method
    6.10.1 Constant Number of Batches
    6.10.2 Increasing Number of Batches
    6.10.3 FNB and SQRT Rules
      6.10.3.1 Interim Review
    6.10.4 Comparing the FNB and SQRT Rules
    6.10.5 LBATCH and ABATCH Rules
      6.10.5.1 LBATCH Rule
    6.10.6 Test for Correlation
    6.10.7 Comparing the FNB, SQRT, LBATCH, and ABATCH Rules
  6.11 Batch Means Analysis Programs
    6.11.1 p-Value
    6.11.2 Prebatching
    6.11.3 A Comparison with the Multiple Replication Approach
  6.12 Regenerative Processes
    6.12.1 Chain Splitting
  6.13 Selecting an Optimal Acceptance Scheme for Metropolis Sampling
  Exercises
  References
7 Generating Pseudorandom Numbers
  7.1 Linear Recurrence Generators
  7.2 Prime Modulus Generators
    7.2.1 Choosing a Prime Modulus
    7.2.2 Finding Primitive Roots
    7.2.3 Sparseness and Nonuniformity
    7.2.4 Computational Efficiency
  7.3 Generators with M = 2^β (β ≥ 3)
    7.3.1 Two's Complement
  7.4 Mixed Congruential Generators
  7.5 Implementation and Portability
  7.6 Apparent Randomness
    7.6.1 Theory and Practice
  7.7 Spectral Test
  7.8 Minimal Number of Parallel Hyperplanes
  7.9 Distance Between Points
  7.10 Discrepancy
  7.11 Beyer Quotient
  7.12 Empirical Assessments
    7.12.1 Testing Hypotheses
    7.12.2 Testing for Independence
    7.12.3 Testing for One-Dimensional Uniformity
    7.12.4 Testing for Bivariate Uniformity
    7.12.5 Testing for Trivariate Uniformity
    7.12.6 An Omnibus Test
  7.13 Combining Linear Congruential Generators
    7.13.1 Majorization
    7.13.2 Shuffling
    7.13.3 Summing Pseudorandom Numbers
  7.14 j-Step Linear Recurrence
  7.15 Feedback Shift Register Generators
    7.15.1 Distributional Properties
  7.16 Generalized Feedback Shift Register Generators
    7.16.1 Initializing a GFSR Sequence
    7.16.2 Distributional Properties
  7.17 Nonlinear Generators
    7.17.1 Quadratic Generators
    7.17.2 Inversive Generators
  Appendix
  Exercises
  References

Author Index
Subject Index
Selected Notation
Chapter 2
edge-disjoint minimal s-t cutsets
ℰ = edge or arc set
ℱ = family of subsets called the object set
G = network
g(q) = probability that s and t are connected given failure probabilities q = (q₁, …, q_m)
I^m = [0,1]^m = m-dimensional unit hypercube
N = stopping time in sequential estimation sampling plans in Sec. 2.16
n = sample size
n(ε, δ, ·) = approximating normal sample size to meet (ε, δ) absolute error criterion for λ(·)
n_C(ε, δ, λ) = Chebyshev sample size to meet (ε, δ) absolute error criterion for λ
n_H(ε, δ) = worst-case distribution-free (Hoeffding) sample size to meet (ε, δ) absolute error criterion
n_H(ε, δ, λ) = distribution-free (Hoeffding) sample size to meet (ε, δ) absolute error criterion for λ
n_H(ε, δ, λ_L, λ_U) = worst-case distribution-free (Hoeffding) sample size to meet (ε, δ) absolute error criterion for λ_L ≤ λ ≤ λ_U
best-case distribution-free (Hoeffding) sample size to meet (ε, δ) absolute error criterion for λ_L ≤ λ ≤ λ_U
worst-case distribution-free (Hoeffding) sample size to meet (ε_r, δ) relative accuracy criterion for λ_L ≤ λ ≤ λ_U
best-case distribution-free (Hoeffding) sample size to meet (ε_r, δ) relative error criterion for λ_L ≤ λ ≤ λ_U
n_N(ε, δ) = worst-case normal sample size to meet (ε, δ) absolute error criterion
n_N(ε, δ, λ) = normal sample size to meet (ε, δ) absolute error criterion for λ
n_N(ε_r, δ, λ) = approximating normal sample size to meet (ε_r, δ) relative error criterion for λ
n_N(ε, δ, λ_L, λ_U) = approximating worst-case normal sample size to meet (ε, δ) absolute error criterion for λ_L ≤ λ ≤ λ_U
Chapter 5
N(s₀, 𝒜) = first passage time from state s₀ to a point in 𝒜
P = ‖p_ij‖ = Markov transition matrix
P^k = k-step transition matrix
Markov transition matrix for coordinate i
P(0) = P
P(θ) = θI + (1 − θ)P
P(x, 𝒜) = 1-step probability of moving from x to 𝒜
P^k(x, 𝒜) = k-step probability of moving from x to 𝒜 ⊆ 𝒮
P̃ = D⁻¹P′D
p_ij = probability of moving from state i to state j
p_ij^(k) = probability of moving from state i to state j in k steps
R = nominating matrix for Metropolis sampling
S_j = state on step j for one-dimensional state vector
S_ij = state of coordinate i on step j
S_j = (S_1j, …, S_mj) = state vector on step j
𝒮_i = state space for coordinate i of the m-dimensional state vector
Markov chain used for coupling
strong dual Markov process
separation error for {p_ij^(k)}
{B(t), t ≥ 0} = Brownian motion
x = (x₁, …, x_m)
x_(−i) = (x₁, …, x_{i−1}, x_{i+1}, …, x_m)
α_ij = probability of accepting nominated transition from state i to state j for Metropolis sampling
{β₀ = 1 > β₁ ≥ β₂ ≥ ⋯ ≥ β_{ν−1} > −1} = ordered eigenvalues of an irreducible aperiodic reversible Markov chain P
β_max = max(β₁, |β_{ν−1}|)
Poincaré constant (expression (131))
π = equilibrium p.m.f.
π₀ = initializing p.m.f.
π_i = equilibrium probability of being in state i
π(x) = probability of being in state x
Φ(ℬ, P) = conductance from ℬ to ℬ̄ using transition matrix P
Φ(P) = global conductance using P
Chapter 6
assumption of strong stationarity
assumption of weak stationarity
B(t) = random batch size for sample path of length t
step on which data collection begins
L(t) = number of batches for sample path of length t
l = number of batches
M_j = 1 if H₀ is rejected on interim review j, and 0 otherwise
proportion of rejections of H₀ on interim reviews
n = number of replications
R_j = cov(X_i, X_{i+j})
S_j = system state on step j (j ≥ 0)
𝒮 = state space
t = sample path length used for data analysis
{W(0) = 0; W(t), t ≥ 0} = standard Brownian motion
W_l/l = batch means estimate of σ²
X_j^(i) = observation j on replication i
X_j = g(S_j) (observation on step j)
X̄_t^(i) = (1/t) Σ_{j=1}^{t} X_j^(i)
X̄_t = X̄_t^(i) for an arbitrary replication i
X̄_{nt} = (1/n) Σ_{i=1}^{n} X̄_t^(i)
Ȳ_j = batch mean j
λ = convergence rate under ASA
μ = μ(g) = EX_j
μ_{js} = E(X_j | S₀ = s), ∀ s ∈ 𝒮
E(X̄_t | S₀ = s), ∀ s ∈ 𝒮
π_i = equilibrium probability of being in state i ∈ 𝒮
π = equilibrium distribution on 𝒮
σ² = R₀(1 + γ) = lim_{t→∞} t var(X̄_t | S₀ = s), ∀ s ∈ 𝒮
var(X_j | S₀ = s), ∀ s ∈ 𝒮
var(X̄_t | S₀ = s), ∀ s ∈ 𝒮
Chapter 7
A = multiplier
D_N^(k)(A, M) = k-dimensional N-point discrepancy
d_k(q, A, M) = distance between neighboring hyperplanes
d_k(A, M) = max_{q ∈ 𝒬_k(A, M)} d_k(q, A, M)
M = modulus
N_k(q, A, M) = number of parallel hyperplanes
N_k(A, M) = max_{q ∈ 𝒬_k(A, M)} N_k(q, A, M)
Σ|q_i| = upper bound on N_k(q, A, M)
P = period
𝒬_k(A, M) = {q : q₁ + q₂A + ⋯ + q_kA^{k−1} ≡ 0 (mod M)}
q = (q₁, …, q_k)
q_k(A, M) = Beyer quotient
U_i = Z_i/M
Z_i = pseudorandom number i
Z₀ = seed
(Z_i, Z_{i+1}, …, Z_{i+k−1})

Introduction
The Monte Carlo method provides approximate solutions to a variety of mathe-
matical problems by performing statistical sampling experiments on a computer.
Remarkably, the method applies to problems with absolutely no probabilistic
content as well as to those with inherent probabilistic structure. This alone does
not give the Monte Carlo method an advantage over other methods of approxima-
tion. However, among all numerical methods that rely on n-point evaluations
in m-dimensional space to produce an approximate solution, the Monte Carlo
method has absolute error of estimate that decreases as n^{-1/2} whereas, in the
absence of exploitable special structure, all others have errors that decrease as n^{-1/m}
at best. This property gives the Monte Carlo method a considerable edge in
computational efficiency as m, the size of the problem, increases. Combinatorial
settings illustrate this property especially well. Whereas the exact solution to a
combinatorial problem with m elements often has computational cost that in-
creases exponentially or superexponentially with m, the Monte Carlo method
frequently provides an estimated solution with tolerable error at a cost that in-
creases no faster than as a polynomial in m.
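To see the n^{-1/2} behavior concretely, the following minimal sketch (illustrative only, not code from the book) estimates the integral of f(x) = x₁x₂⋯x_m over the unit hypercube, whose exact value is 2^{-m}, by crude Monte Carlo sampling; quadrupling n roughly halves the error regardless of the dimension m.

```python
# Crude Monte Carlo estimate of the integral of x_1*...*x_m over [0,1]^m,
# whose exact value is 2**(-m).  Illustrates that the absolute error shrinks
# roughly like n**(-1/2), independently of the dimension m.
import random
from math import prod

def mc_estimate(m, n, rng):
    total = 0.0
    for _ in range(n):
        total += prod(rng.random() for _ in range(m))
    return total / n

if __name__ == "__main__":
    rng = random.Random(12345)   # fixed seed for reproducibility
    m = 10                       # dimension of the hypercube
    exact = 0.5 ** m
    for n in (10_000, 40_000, 160_000):
        err = abs(mc_estimate(m, n, rng) - exact)
        print(f"n = {n:7d}   |error| = {err:.3e}")
```

Deterministic rules that rely on a grid of n points, by contrast, can place only about n^{1/m} points along each coordinate axis, which is the source of the dimension effect described above.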
Deceptively simple in concept, the Monte Carlo method incorporates three
distinct, but related, historical developments in the mathematical sciences. First,
games of chance motivated seventeenth- and eighteenth-century mathematicians
to regard outcomes on successive trials as forming a sequence of random events.
Observing that the mean of a function of continuous random variables took
the form of an integral, nineteenth- and early-twentieth-century statisticians sub-
sequently recognized that, in principle, one could randomly draw numbers, transform them according to prescribed rules, and derive an approximate solution to an
integral in a problem that intrinsically contained no probabilistic content whatever
(e.g., National Bureau of Standards 1951, p. 40). In the late nineteenth century, a
second line of inquiry developed when Lord Rayleigh (1899) showed that a one-
dimensional random walk without absorbing barriers could provide an approxi-
mate solution to a parabolic differential equation. In the course of demonstrating
that under appropriate conditions a particular finite difference equation could
produce an approximate solution to the Dirichlet boundary-value problem of
partial differential equations, Courant et al. (1928) showed that the recursive form
of the solution to a two-dimensional random walk on a square grid within a closed
region whose boundary points were absorbing states produced the identical differ-
ence equation. Shortly thereafter, Kolmogorov (1931) showed the relationship
between Markov stochastic processes and certain integro-differential equations.
Petrowsky (1933) considerably generalized the result of Courant et al. by showing
the asymptotic connection between a random walk whose sequence of locations
formed a Markov chain and the solution to an elliptic partial differential equation.
He called this formulation the generalized Dirichlet problem.
At that time, these revelations derived their significance from the observation
that solutions to problems encountered in stochastic processes often corresponded
to solutions that arose in the study of partial differential equations, and, there-
fore, one could employ the difference equation methods applicable for solving the
differential equations to provide approximate solutions to the original stochastic
problem.
During the development of atomic energy in the post-World War II era, a
third line of inquiry evolved. Scientists needed to solve problems of neutron diffu-
sion or transport through an isotropic medium.¹ These multidimensional prob-
lems proved too formidable for the difference equation approach. Since it had
already been established that the resulting partial differential and integral equa-
tions all had analogs in stochastic processes, John von Neumann and Stanislaw
Ulam suggested that sampling experiments using random walk models and executed
on the newly developed digital computer could provide readily usable approxima-
tions to the desired solutions. This proposal reversed the direction of reasoning.
Instead of using the cumbersome difference equation approach to provide solu-
tions to probabilistic problems, one conducted sampling experiments to provide
solutions to the integro-differential equations, which did not necessarily have a
probabilistic basis themselves. Moreover, the availability of the digital computer
provided the means for putting this suggestion into practice. Later, the concept of
employing sampling experiments on a computer came to prevail in many scientific
disciplines, most notably, in chemistry, engineering, operations research, physics,
and statistics. In operations research, studies of all but the most elementary queue-
ing models rely on generating random tours on computers to provide approximate
solutions. In physics, the Monte Carlo method has come to be recognized as the
¹ An isotropic medium has the same properties in all directions.